rnorouzian

Reputation: 7517

Subsetting in a second level R function

Function foo1 splits a data frame into a list by study.name and can subset each element of that list by a requested condition (e.g., by = type == 1). If by is omitted, foo1 simply returns the list unchanged.

For my purposes, I need to use foo1 within a new function called foo2.

In my code below, my desired output is obtained like so: foo2(data = D, by = G[[1]]) ; foo2(data = D, by = G[[2]]) ; foo2(data = D, by = G[[3]]).

But why do I get the error shown below when I loop over G using lapply?

foo1 <- function(data, by){

  L <- split(data, data$study.name) ; L[[1]] <- NULL

  if(!missing(by)){

   L <- lapply(L, function(x) do.call("subset", list(x, by)))
  }
 return(L)
}


foo2 <- function(data, by){

  eval(substitute(foo1(data = data, by = by)))
}

## EXAMPLE OF USE:
D <- read.csv("https://raw.githubusercontent.com/izeh/i/master/k.csv", header = TRUE) ## Data

G <- lapply(unique(na.omit(D$type)), function(i) bquote(type == .(i)))# all levels of `type`

foo2(data = D, by = G[[1]]) # Works fine without `lapply` :-)

lapply(1:3, function(i) foo2(data = D, by = G[[i]])) # Doesn't work with `lapply`! :-(
# Error in do.call("subset", list(x, by)) : object 'i' not found

Upvotes: 2

Views: 129

Answers (2)

akrun

Reputation: 887108

Instead of lapply, a for loop can be used here:

lst1 <- vector("list", length(G))
for(i in 1:3) lst1[[i]] <- foo2(data = D, by = G[[i]])

Checking:

identical(lst1[[2]],  foo2(data = D, by = G[[2]]))
#[1] TRUE
identical(lst1[[3]],  foo2(data = D, by = G[[3]]))
#[1] TRUE

For the lapply part, the problem is that foo2 evaluates G[[i]] in a place where the i of the anonymous function is not visible. Calling foo1 directly from the anonymous function (here with the index renamed to 'j') avoids this:

lst2 <- lapply(1:3, function(j) foo1(data = D, by = G[[j]]))

This should work:

identical(lst2[[2]], lst1[[2]])
#[1] TRUE
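
One way to check whether the elements of G still refer to i is to list the variables they contain. The sketch below uses a hypothetical stand-in for the levels of type (type_levels and G_demo are illustrative names, not from the question) and assumes G is built with bquote exactly as in the question.

```r
# Hypothetical stand-in for unique(na.omit(D$type)); the real levels come from k.csv
type_levels <- c(1, 2, 3)

# Same construction as G in the question: .(i) is replaced by the value of i
G_demo <- lapply(type_levels, function(i) bquote(type == .(i)))

# Show the first expression and the variable names it refers to
print(G_demo[[1]])
vars_used <- all.vars(G_demo[[1]])
print(vars_used)
```

If only type is listed, the expressions stored in G carry literal values, and any reference to i must come from the unevaluated G[[i]] argument rather than from G itself.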

Upvotes: 1

user2554330

Reputation: 44867

Your foo2 function tries to evaluate the expression

foo1(data = D, by = G[[i]])

but it doesn't have i available. You need to evaluate G[[i]] in the anonymous function you're passing to lapply to get an expression defining the subset, and then evaluate that subset in foo2. I recommend naming that function instead of using an anonymous one; it makes debugging a lot easier.

Here's some recoding that appears to work:

Redefine foo2 to

foo2 <- function(data, by){
  by <- eval(by, envir = data)
  foo1(data = data, by = by)
}

and

foo3 <- function(i) {
    expr <- G[[i]]
    foo2(data = D, by = expr)
}

and then

lapply(1:3, foo3)

I'm not sure this does exactly what you want, but it should be close enough that you can fix it up.
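
As a possible alternative, foo2 could pass by straight through to foo1 instead of evaluating it against the full data, since eval(by, envir = data) yields one logical value per row of data and may not line up with the per-study pieces that foo1 subsets. The sketch below is an assumption-laden illustration, not a tested fix for the real k.csv: D_toy and G_toy are hypothetical stand-ins, and foo1 is copied from the question except that the L[[1]] <- NULL step is left out.

```r
# Hypothetical toy data standing in for k.csv (only the two columns used here)
D_toy <- data.frame(
  study.name = c("A", "A", "B", "B", "C", "C"),
  type       = c(1, 2, 1, 2, 1, 3)
)

# foo1 as in the question, minus `L[[1]] <- NULL` (which drops the first study)
foo1 <- function(data, by) {
  L <- split(data, data$study.name)
  if (!missing(by)) {
    # `by` evaluates to an expression such as type == 1, which subset()
    # then evaluates inside each per-study data frame
    L <- lapply(L, function(x) do.call("subset", list(x, by)))
  }
  L
}

# Forward `by` unchanged: it is evaluated in the caller's environment,
# so an index such as i inside lapply() is visible
foo2 <- function(data, by) foo1(data = data, by = by)

G_toy <- lapply(unique(na.omit(D_toy$type)), function(i) bquote(type == .(i)))

out <- lapply(seq_along(G_toy), function(i) foo2(data = D_toy, by = G_toy[[i]]))

# Omitting `by` still returns the unsubsetted list
all_groups <- foo2(data = D_toy)
```

With this version the lapply call and the individual calls foo2(data = D_toy, by = G_toy[[1]]) and so on go through the same code path, so no separate named wrapper is needed.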

Upvotes: 2
