My dataset has the following form:
    df <- data.frame(c("a", "a", "a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b", "b", "b"),
                     c(1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2, 2, 2),
                     c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4),
                     c(25, 75, 20, 40, 60, 50, 20, 10, 20, 30, 40, 60, 25, 75, 20, 40))
    colnames(df) <- c("car", "year", "mnth", "val")
For clarity, I show it here as well:
       car year mnth val
    1    a    1    1  25
    2    a    1    2  75
    3    a    1    3  20
    4    a    1    4  40
    5    a    2    1  60
    6    a    2    2  50
    7    a    2    3  20
    8    a    2    4  10
    9    b    1    1  20
    10   b    1    2  30
    11   b    1    3  40
    12   b    1    4  60
    13   b    2    1  25
    14   b    2    2  75
    15   b    2    3  20
    16   b    2    4  40
I would like to add a new column tmp to df where, for a particular row, the value of tmp should be the average of df$val for that row and the 3 preceding values. Some examples of tmp are shown here:
    #row 3:  mean(c(25, 75, 20)) = 40
    #row 4:  mean(c(25, 75, 20, 40)) = 40
    #row 5:  mean(c(75, 20, 40, 60)) = 48.75
    #row 16: mean(c(25, 75, 20, 40)) = 40
Is there an efficient way to do this in R without using for-loops?
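For each value, calculate the mean of a rolling window that includes the value and the preceding 3 values (from index i-3 to index i in the solution below). For the cases when i-3 is negative, you can just use 0 (max((i-3),0)). R silently drops index 0 when subsetting, so 0:i selects elements 1 through i.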
    sapply(seq_along(df$val), function(i) mean(df$val[max((i-3),0):i], na.rm = TRUE))
    #[1] 25.00 50.00 40.00 40.00 48.75 42.50 42.50 35.00 25.00
    #[10] 20.00 25.00 37.50 38.75 50.00 45.00 40.00
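To get the tmp column the question asks for, the same call can be assigned to df$tmp. The following is a minimal sketch rather than a definitive version: it rebuilds the example data with rep() (a shorthand assumed here, not the original construction) so that it runs on its own.

```r
# Rebuild the example data so this snippet is self-contained
df <- data.frame(
  car  = rep(c("a", "b"), each = 8),
  year = rep(rep(1:2, each = 4), times = 2),
  mnth = rep(1:4, times = 4),
  val  = c(25, 75, 20, 40, 60, 50, 20, 10, 20, 30, 40, 60, 25, 75, 20, 40)
)

# Trailing mean over the current value and up to 3 preceding values.
# For i <= 3 the window starts at 0, which R drops, leaving 1:i.
df$tmp <- sapply(seq_along(df$val), function(i) {
  mean(df$val[max(i - 3, 0):i], na.rm = TRUE)
})

head(df, 5)
```

Note that the window runs over the whole val column in row order, so it crosses the year and car boundaries, which matches the row 5 example in the question.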
You could also consider rollmean from the zoo package:
    library(zoo)
    c(rep(NA, 3), rollmean(x = df$val, k = 4))
    #[1]    NA    NA    NA 40.00 48.75 42.50 42.50 35.00 25.00 20.00 25.00
    #[12] 37.50 38.75 50.00 45.00 40.00
    #further tweaking may be necessary
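One possible tweak, if the first three rows should hold partial-window means instead of NA, is zoo's rollapplyr with partial = TRUE. This is only a sketch under that assumption; the vector val below repeats the question's values so the snippet runs standalone.

```r
library(zoo)

# Same values as df$val in the question
val <- c(25, 75, 20, 40, 60, 50, 20, 10, 20, 30, 40, 60, 25, 75, 20, 40)

# Right-aligned window of width 4; partial = TRUE lets the first
# three positions use however many values are available so far
tmp_partial <- rollapplyr(val, width = 4, FUN = mean, partial = TRUE)

# Right-aligned rolling mean that pads the incomplete windows with NA,
# equivalent to prepending rep(NA, 3) by hand
tmp_padded <- rollmeanr(val, k = 4, fill = NA)
```

Here tmp_partial should agree with the sapply result, while tmp_padded keeps NA in the first three positions.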