Fateta

Reputation: 429

skipping error "undefined columns selected" in lapply

Consider the following simple dataset ds:

ds <- data.frame("x"=c(1,2,3), "y"=c(5,5,5))

I apply a function to some columns of ds, such as x and y, and create two new variables named xnew and ynew. This works well:

ds[,c("xnew","ynew")] <- lapply(ds[,c("x","y")], function(x) x^2)

But suppose there are some undefined column names, like z. In this case I get the error "undefined columns selected", and neither xnew nor ynew is created. Is there any way to skip this error, create xnew and ynew, and get an error only for znew (something like tryCatch in a for loop)?

    ds[,c("xnew","ynew","znew")] <- lapply(ds[,c("x","y","z")], function(x) x^2)

    Error in `[.data.frame`(ds, , c("x", "y", "z")) : 
    undefined columns selected

Upvotes: 1

Views: 214

Answers (1)

JDG

Reputation: 1364

You can define the columns passed to lapply (oldvars) as the intersection of the column names of ds (x, y) and a vector of requested names that may include undefined columns (x, y, z). Undefined names such as z drop out before the subsetting step, so the error is never raised. For the record, the data.table package offers a concise way to apply lapply over a set of columns, and it is typically faster than base R on large datasets.

Code

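library(data.table)

ds = data.table(ds)  # convert the data.frame to a data.table

# keep only the requested columns that actually exist in ds
oldvars = intersect(c('x', 'y', 'z'), colnames(ds))
newvars = paste0(oldvars, '_new')

# square each existing column and add the results as new columns by reference
ds[, (newvars) := lapply(.SD, function(x) x^2), .SDcols = oldvars]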

The last line applies lapply to .SD, the subset of the data.table whose columns are declared through the .SDcols argument (in this case, x and y).

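Using base R instead of data.table (from the OP's comment), with ds left as a plain data.frame. List-style indexing (ds[oldvars]) is used so that a single matching column is not dropped to a vector:

ds[newvars] <- lapply(ds[oldvars], function(x) x^2)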

Result:

> ds
   x y x_new y_new
1: 1 5     1    25
2: 2 5     4    25
3: 3 5     9    25
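The intersection silently ignores z, whereas the question asked to be told about it. A minimal base R sketch of one way to do that, assuming a warning is acceptable in place of an error; the names wanted, present and missing_cols are made up for this illustration:

```r
ds <- data.frame(x = c(1, 2, 3), y = c(5, 5, 5))

# Requested columns; "z" does not exist in ds.
wanted <- c("x", "y", "z")

# Split the request into columns that exist and columns that do not.
present <- intersect(wanted, colnames(ds))
missing_cols <- setdiff(wanted, colnames(ds))

# Report the undefined columns instead of failing on them.
if (length(missing_cols) > 0) {
  warning("skipping undefined columns: ", paste(missing_cols, collapse = ", "))
}

# Square only the columns that exist; list-style indexing avoids
# dropping a single column to a vector.
ds[paste0(present, "new")] <- lapply(ds[present], function(x) x^2)
```

Replacing warning() with stop() after the assignment would create xnew and ynew first and then raise an error that names only the missing columns.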

Upvotes: 2
