r - Use time interval to calculate mean from another data frame -


i have 2 data frames. 1 data frame contains the time of net set (beginning) , lift (end). need use time period calculate mean second data frame. having issues matching times because times in catch data frame specific, whereas times in data frame hourly. tried matching closest time couldn't manage that, next thought use within time interval. mean needs specific boat , net within time interval. doing several large data sets need able loop function rather code each line.

catchdf:

boat    net   set                    lift dawn    26    2016-05-19 12:20:11    2016-05-27 11:48:36 nip     26    2016-07-28 07:25:47    2016-08-07 06:13:10 dawn    26.3  2016-08-01 13:24:51    2016-08-03 07:48:52 

dodf:

time                        boat    net 2016-05-19 13:00:00    10.2    dawn    26 2016-05-21 15:00:00    10.4    dawn    26 2016-05-26 09:00:00    10.9    dawn    26 2016-05-28 10:00:00    9.4     dawn    26 2016-07-28 09:00:00    11.9    nip     26 2016-07-28 19:00:00    12.4    nip     26 2016-08-04 04:00:00    5.4     nip     26 2016-08-01 05:00:00    13.2    dawn    26.3 2016-08-02 16:00:00    12.3    dawn    26.3 2016-08-05 22:00:00    2.4     dawn    26.3 

my goal insert column catchdf meando each time period.

boat    net   set                    lift                  meando dawn    26    2016-05-19 12:20:11    2016-05-27 11:48:36   10.50 nip     26    2016-07-28 07:25:47    2016-08-07 06:13:10   9.90 dawn    26.3  2016-08-01 13:24:51    2016-08-03 07:48:52   12.3 

this code used make sample data frames:

catchdf <- structure(list(boat = c("dawn", "nip", "dawn"), net = c("26", "26", "26.3"), set = c("2016-05-19 12:20:11", "2016-07-28 07:25:47", "2016-08-01 13:24:51"), lift = c("2016-05-27 11:48:36", "2016-08-07 06:13:10", "2016-08-03 07:48:52")), .names = c("boat", "net", "set", "lift"), class = "data.frame", row.names = c(na, -3l))  dodf <- structure(list(time = c("2016-05-19 13:00:00", "2016-05-21 15:00:00", "2016-05-26 09:00:00", "2016-05-28 10:00:00", "2016-07-28 09:00:00", "2016-07-28 19:00:00", "2016-08-04 04:00:00", "2016-08-01 05:00:00", "2016-08-02 16:00:00", "2016-08-05 22:00:00"), = c("10.2", "10.4", "10.9", "9.4", "11.9", "12.4", "5.4", "13.2", "12.3", "2.4"), boat = c("dawn", "dawn", "dawn", "dawn", "nip", "nip", "nip", "dawn", "dawn", "dawn"), net = c("26", "26", "26", "26", "26", "26", "26", "26.3", "26.3", "26.3")), .names = c("time", "do", "boat", "net"), class = "data.frame", row.names = c(na, -10l)) 

i've been stuck on 1 quite while , haven't been able make progress. appreciated.

here possible data.table solution. first, fix formats of columns

library(data.table) #v1.10.4 cols <- c("set", "lift") setdt(catchdf)[, (cols) := lapply(.sd, as.posixct), .sdcols = cols] setdt(dodf)[, `:=`(time = as.posixct(time), = as.numeric(do))] 

then can make non-equi join between data sets while calculating mean on fly

dodf[catchdf, .(meando = mean(do)), on = .(boat, net, time > set, time < lift), = .eachi] #    boat  net                time                time meando # 1: dawn   26 2016-05-19 12:20:11 2016-05-27 11:48:36   10.5 # 2:  nip   26 2016-07-28 07:25:47 2016-08-07 06:13:10    9.9 # 3: dawn 26.3 2016-08-01 13:24:51 2016-08-03 07:48:52   12.3 

Comments