I'm new to R and have searched on Google for a solution to the problem below.
I have

    dt = data.table(y=c("a",NA,NA), y_1=c(NA,3,6), y_2=c(1,NA,3), y_3=c(1,1,1))

I want to create a function by passing it the data.table and the column that needs to be changed.

    fun <- function(dt, var) {
      dt[, (var) := ifelse(!(is.na(get(var))), get(paste0(var,"1")),
                    ifelse(!(is.na(get(paste0(var,"1")), get(paste0(var,"2")...))]
      return(dt)
    }

I want to replace the NA values in the y variable with the values in y_1 if those are not null, or else with the values in y_2, and so on. I want to create a function that can accept different variables with the same ending.
Update: Uwe, thanks for pointing to the previous question. I found it pretty useful. But my requirement is a little different. I need to do the same update for other variables whose values are NA. For example, I need to do this for (x, x_1, x_2, x_3, ...), (z, z_1, z_2, z_3, ...) and other variables apart from y. Is there a way to use lapply or a function for that?
Thanks in advance.
The OP is looking for a variant of the LOCF method (last observation carried forward), which is implemented by zoo::na.locf(), for instance. While na.locf() is applied to a vector or to a column of a data.frame, the OP is looking for a variant that is applied to each row of a data.table, restricted to a specific subset of columns. So, the function is being named na.locl() (last observation carried left).
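To make the difference between the two directions concrete, here is a minimal base-R sketch on plain vectors. The helper names locf_vec and first_non_na are made up for illustration and belong to neither zoo nor data.table; locf_vec is not a replacement for zoo::na.locf(), it only shows the idea of carrying a value along a vector.

```r
# Hypothetical helper: carry the last non-NA value forward along a vector
locf_vec <- function(v) {
  idx <- cummax(seq_along(v) * !is.na(v))  # position of the last non-NA seen so far
  idx[idx == 0] <- NA                      # nothing to carry into leading NAs
  v[idx]
}

# Hypothetical helper: first non-NA value of one row, read from left to right
first_non_na <- function(row) {
  row[!is.na(row)][1]                      # NA if the whole row is NA
}

locf_vec(c(NA, 3, NA, NA, 6, NA))   # NA 3 3 3 6 6
first_non_na(c(NA, NA, 3, 1))       # 3
first_non_na(c(NA, NA, NA, NA))     # NA
```

The first helper works down a column, the second across a row, which is the direction the question asks for.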
In addition, the data.table is to be updated in place, i.e., without copying. The columns are named in a specific manner, e.g., x, x_1, x_2, x_3, etc. So, x is a kind of base name for the subset of columns.
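As a small illustration of what updating in place means, the sketch below calls data.table::set() inside a function and then checks that the caller's table has changed without being reassigned. The function name fill_demo and the sample table demo are invented for this example.

```r
library(data.table)

# Hypothetical helper: overwrite column x by reference; nothing useful is returned
fill_demo <- function(dt) {
  set(dt, j = "x", value = dt[["x_1"]])
  invisible(NULL)
}

demo <- data.table(x = c(NA_real_, NA_real_), x_1 = c(3, 6))
addr_before <- address(demo)

fill_demo(demo)                # note: no assignment such as demo <- fill_demo(demo)
demo$x                         # 3 6
address(demo) == addr_before   # TRUE: still the same object, not a copy
```

Because the table is modified by reference, the function can return invisible(NULL) and is called only for its side effect.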
The function below finds, in each row of a specific subset of columns of a given data.table, the first non-NA column and copies its value to column x.
The implementation is based on this solution. It also includes some plausibility checks.
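
    na.locl <- function(var, dt) {
      # plausibility checks
      checkmate::assert_data_table(dt)
      checkmate::assert_string(var)
      checkmate::assert_choice(var, names(dt))
      # result vector, filled from the first non-NA column in each row
      ans_val = rep_len(NA_real_, nrow(dt))
      # base column plus its numbered variants, e.g., x, x_1, x_2, ...
      selected_cols <- unlist(lapply(
        var, function(x) stringr::str_subset(names(dt), paste0("^", x, "(_\\d*)?$"))))
      for(col in selected_cols) {
        # rows still unfilled where the current column has a value
        i = is.na(ans_val) & (!is.na(dt[[col]]))
        ans_val[i] = dt[[col]][i]
      }
      # update the base column by reference
      set(dt, , var, ans_val)
      return(invisible(NULL))
    }
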
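The column selection inside na.locl() rests on the regular expression built by paste0(). As a rough illustration, the base-R sketch below applies the same pattern to a made-up set of column names; the names x_10, xx and x_a are assumptions added only to show what is and is not matched, and grep() stands in for stringr::str_subset().

```r
# Made-up column names for illustration
cols <- c("x", "x_1", "x_2", "x_10", "xx", "x_a", "y", "y_1")

var <- "x"
pattern <- paste0("^", var, "(_\\d*)?$")   # base name, optionally followed by _<digits>

grep(pattern, cols, value = TRUE)
# "x" "x_1" "x_2" "x_10"
```

The matches come back in the order of the input names, so the loop visits x first, then x_1, x_2 and so on. Since \\d* also allows zero digits, a column named x_ would be selected as well.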
In addition, the OP has requested to repeat this for other variables. This can be accomplished using lapply() with the na.locl() function. To demonstrate this, sample data are required.

    library(data.table)
    dt0 <- data.table(y=c("a",NA,NA,NA), y_1=c(NA,3,NA,NA), y_2=c(1,NA,3,NA), y_3=c(1,1,1,NA))
    dt <- cbind(dt0, setnames(copy(dt0), stringr::str_replace(names(dt0), "^y", "x")))
    dt <- cbind(dt, setnames(copy(dt0), stringr::str_replace(names(dt0), "^y", "zzz")))
    dt
    #     y y_1 y_2 y_3  x x_1 x_2 x_3 zzz zzz_1 zzz_2 zzz_3
    #1:   a  NA   1   1  a  NA   1   1   a    NA     1     1
    #2:  NA   3  NA   1 NA   3  NA   1  NA     3    NA     1
    #3:  NA  NA   3   1 NA  NA   3   1  NA    NA     3     1
    #4:  NA  NA  NA  NA NA  NA  NA  NA  NA    NA    NA    NA

The columns y, x, and zzz are all NA except in row 1. After applying the function on dt,

    dummy <- lapply(c("x", "y", "zzz"), na.locl, dt = dt)
    dt
    #     y y_1 y_2 y_3  x x_1 x_2 x_3 zzz zzz_1 zzz_2 zzz_3
    #1:   a  NA   1   1  a  NA   1   1   a    NA     1     1
    #2:   3   3  NA   1  3   3  NA   1   3     3    NA     1
    #3:   3  NA   3   1  3  NA   3   1   3    NA     3     1
    #4:  NA  NA  NA  NA NA  NA  NA  NA  NA    NA    NA    NA

The missing values in columns y, x, and zzz have been replaced by the next non-NA value to the right, if one is available within the subset of columns. Thus, row 4 is still NA, as no non-NA value (that's 3 negations in a row) is available in any of the column subsets.
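For comparison, when all columns of a subset share one type, the same row-wise fill could probably also be written with data.table::fcoalesce(). This is a different technique from the loop in na.locl(), shown here only as a sketch; it assumes data.table 1.12.4 or later, and the all-numeric table dt_num is invented for the example.

```r
library(data.table)

# Hypothetical all-numeric sample; fcoalesce() expects columns of one common type
dt_num <- data.table(x   = c(5, NA, NA, NA),
                     x_1 = c(NA, 3, NA, NA),
                     x_2 = c(1, NA, 3, NA),
                     x_3 = c(1, 1, 1, NA))

cols <- grep("^x(_\\d*)?$", names(dt_num), value = TRUE)
dt_num[, x := fcoalesce(.SD), .SDcols = cols]   # first non-NA per row, by reference
dt_num$x
# 5 3 3 NA
```

With the sample data above, where y is character and y_1 to y_3 are numeric, this would likely need a type conversion first, whereas the loop in na.locl() relies on R's implicit coercion.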