i have 2 data frames. 1 data frame contains the time of net set (beginning) , lift (end). need use time period calculate mean second data frame. having issues matching times because times in catch data frame specific, whereas times in data frame hourly. tried matching closest time couldn't manage that, next thought use within time interval. mean needs specific boat , net within time interval. doing several large data sets need able loop function rather code each line.
catchdf:
boat net set lift dawn 26 2016-05-19 12:20:11 2016-05-27 11:48:36 nip 26 2016-07-28 07:25:47 2016-08-07 06:13:10 dawn 26.3 2016-08-01 13:24:51 2016-08-03 07:48:52
dodf:
time boat net 2016-05-19 13:00:00 10.2 dawn 26 2016-05-21 15:00:00 10.4 dawn 26 2016-05-26 09:00:00 10.9 dawn 26 2016-05-28 10:00:00 9.4 dawn 26 2016-07-28 09:00:00 11.9 nip 26 2016-07-28 19:00:00 12.4 nip 26 2016-08-04 04:00:00 5.4 nip 26 2016-08-01 05:00:00 13.2 dawn 26.3 2016-08-02 16:00:00 12.3 dawn 26.3 2016-08-05 22:00:00 2.4 dawn 26.3
my goal insert column catchdf meando each time period.
boat net set lift meando dawn 26 2016-05-19 12:20:11 2016-05-27 11:48:36 10.50 nip 26 2016-07-28 07:25:47 2016-08-07 06:13:10 9.90 dawn 26.3 2016-08-01 13:24:51 2016-08-03 07:48:52 12.3
this code used make sample data frames:
catchdf <- structure(list(boat = c("dawn", "nip", "dawn"), net = c("26", "26", "26.3"), set = c("2016-05-19 12:20:11", "2016-07-28 07:25:47", "2016-08-01 13:24:51"), lift = c("2016-05-27 11:48:36", "2016-08-07 06:13:10", "2016-08-03 07:48:52")), .names = c("boat", "net", "set", "lift"), class = "data.frame", row.names = c(na, -3l)) dodf <- structure(list(time = c("2016-05-19 13:00:00", "2016-05-21 15:00:00", "2016-05-26 09:00:00", "2016-05-28 10:00:00", "2016-07-28 09:00:00", "2016-07-28 19:00:00", "2016-08-04 04:00:00", "2016-08-01 05:00:00", "2016-08-02 16:00:00", "2016-08-05 22:00:00"), = c("10.2", "10.4", "10.9", "9.4", "11.9", "12.4", "5.4", "13.2", "12.3", "2.4"), boat = c("dawn", "dawn", "dawn", "dawn", "nip", "nip", "nip", "dawn", "dawn", "dawn"), net = c("26", "26", "26", "26", "26", "26", "26", "26.3", "26.3", "26.3")), .names = c("time", "do", "boat", "net"), class = "data.frame", row.names = c(na, -10l))
i've been stuck on 1 quite while , haven't been able make progress. appreciated.
here possible data.table solution. first, fix formats of columns
library(data.table) #v1.10.4 cols <- c("set", "lift") setdt(catchdf)[, (cols) := lapply(.sd, as.posixct), .sdcols = cols] setdt(dodf)[, `:=`(time = as.posixct(time), = as.numeric(do))]
then can make non-equi join between data sets while calculating mean on fly
dodf[catchdf, .(meando = mean(do)), on = .(boat, net, time > set, time < lift), = .eachi] # boat net time time meando # 1: dawn 26 2016-05-19 12:20:11 2016-05-27 11:48:36 10.5 # 2: nip 26 2016-07-28 07:25:47 2016-08-07 06:13:10 9.9 # 3: dawn 26.3 2016-08-01 13:24:51 2016-08-03 07:48:52 12.3
Comments
Post a Comment