conflictcoder

Reputation: 403

How can I update a pre-defined list of columns in data.table without dropping the others?

I would like to update certain columns in a data.table without being overly verbose. Here's an example that does almost what I want:

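library(data.table)

DT <- data.table(A=1:4, B=3:6, C=rep(1,4), id = c(1,1,2,2))
DT[2,1] <- NA
DT[3,2] <- NA
DT[4,3] <- NA
cols_to_change <- c("A","B")
# fills NAs within each id, but returns only id plus unnamed (V1, V2) filled columns
DT <- DT[,nafill(.SD, "locf"), by=id, .SDcols = cols_to_change]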

The only problem is that column "C" gets dropped and the names of "A" and "B" are changed. In reality, I have many more columns to change, and I'd like to run two update functions (locf and nocb), so it makes sense to list them once in a cols_to_change vector rather than repeating them in each update call. I assume there's some way to do this with := that I don't quite grasp, or perhaps with dplyr's group_by and mutate functions. In any case, I'm open to whatever works.

Upvotes: 1

Views: 98

Answers (1)

akrun

Reputation: 887971

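We need to update the columns by reference with :=. Wrap 'cols_to_change' in parentheses on the left-hand side so that data.table evaluates the object and uses the column names stored in it, instead of treating cols_to_change as a literal column name. Because := modifies DT in place, the other columns (such as "C") are kept and the names of "A" and "B" are unchanged.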

DT[,(cols_to_change) := nafill(.SD, "locf"), by=id, .SDcols = cols_to_change]
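Since the question also mentions running both locf and nocb, here is a minimal sketch of how the same pattern might be chained, reusing the example data from the question. Running locf first and nocb second is an assumption about the intended order; swap the two calls if the opposite priority is wanted.

```r
library(data.table)

# Example data from the question, with one NA in each of A, B and C
DT <- data.table(A = 1:4, B = 3:6, C = rep(1, 4), id = c(1, 1, 2, 2))
DT[2, A := NA]
DT[3, B := NA]
DT[4, C := NA]

cols_to_change <- c("A", "B")

# Step 1: carry the last observation forward within each id
DT[, (cols_to_change) := nafill(.SD, type = "locf"), by = id, .SDcols = cols_to_change]

# Step 2: fill any NA left at the start of a group with the next observation
DT[, (cols_to_change) := nafill(.SD, type = "nocb"), by = id, .SDcols = cols_to_change]

print(DT)
```

In this data, B starts with NA in the id == 2 group, so locf has nothing to carry forward there and the nocb pass is what fills it. Column C is not listed in cols_to_change, so its NA is left as is.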

Upvotes: 4
