Reputation: 429
Consider the following simple dataset ds:
ds <- data.frame("x"=c(1,2,3), "y"=c(5,5,5))
I apply a function to some columns of ds, such as x and y, and create two new variables named xnew and ynew. It works well:
ds[,c("xnew","ynew")] <- lapply(ds[,c("x","y")], function(x) x^2)
But suppose some of the column names are undefined, like z. In this case I get the error "undefined columns selected" and neither xnew nor ynew is created.
Is there any way to skip this error, create xnew and ynew, and get an error only for znew? (Something like tryCatch in a for loop.)
ds[,c("xnew","ynew","znew")] <- lapply(ds[,c("x","y","z")], function(x) x^2)
Error in `[.data.frame`(ds, , c("x", "y", "z")) :
undefined columns selected
Upvotes: 1
Views: 214
Reputation: 1364
You can define the columns passed to lapply (oldvars) as the intersection of the column names of ds (x, y) and a vector of requested names that may include undefined columns (x, y, z). Undefined names such as z drop out before subsetting, so no error is raised. The example below uses the data.table package, which applies lapply over a column subset internally and is often faster than base R on large datasets.
Code
library(data.table)
ds = data.table(ds)
# keep only the requested columns that actually exist in ds
oldvars = intersect(c('x', 'y', 'z'), colnames(ds))
newvars = paste0(oldvars, '_new')
ds[, (newvars) := lapply(.SD, function(x) x^2), .SDcols = oldvars]
The last line applies the lapply statement onto a data.table subset (.SD), whereby the subset columns are declared using the .SDcols argument (in this case, x and y).
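Using base R instead of data.table (from OP's comment), with list-style indexing so that a single matching column is still treated as a column rather than a plain vector:
ds[newvars] <- lapply(ds[oldvars], function(x) x^2)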
Result:
> ds
   x y x_new y_new
1: 1 5     1    25
2: 2 5     4    25
3: 3 5     9    25
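The intersect approach above drops undefined names silently. If you also want to be told which columns were skipped, as the question asks for z, a minimal base R sketch could look like the following. The helper name square_existing and its suffix argument are hypothetical, introduced only for illustration, and the helper raises a warning instead of an error so that the existing columns are still processed.

```r
# Hypothetical helper (not part of the original answer): square the requested
# columns that exist in df and warn about the ones that do not.
square_existing <- function(df, wanted, suffix = "_new") {
  present <- intersect(wanted, names(df))  # requested columns that exist
  absent  <- setdiff(wanted, names(df))    # requested columns that do not
  if (length(absent) > 0) {
    warning("skipping undefined columns: ", paste(absent, collapse = ", "))
  }
  if (length(present) > 0) {
    # list-style indexing keeps a data.frame even when only one column matches
    df[paste0(present, suffix)] <- lapply(df[present], function(v) v^2)
  }
  df
}

ds <- data.frame(x = c(1, 2, 3), y = c(5, 5, 5))
ds <- square_existing(ds, c("x", "y", "z"))
```

Calling it with c("x", "y", "z") adds x_new and y_new and warns that z was skipped. Replacing warning() with message() would make the report quieter.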
Upvotes: 2