How to avoid using nested lapply in R?

Question

I am looking for an efficient alternative to nested lapply calls; as far as I understand, nested structures like this are discouraged in the R community. Can anyone suggest ideas or an approach for avoiding nested lapply in a custom function?

Here is a quick reproducible example:

Simulated data:

a <- data.frame(
  start = seq(1, by = 9, length.out = 18), stop = seq(6, by = 9, length.out = 18),
  ID = letters[1:18], score = sample(1:25, 18, replace = FALSE))
b <- data.frame(
  start = seq(2, by = 11, length.out = 20), stop = seq(8, by = 11, length.out = 20),
  ID = letters[1:20], score = sample(1:25, 20, replace = FALSE))
c <- data.frame(
  start = seq(4, by = 11, length.out = 25), stop = seq(9, by = 11, length.out = 25),
  ID = letters[1:25], score = sample(1:25, 25, replace = FALSE))

The code where I use nested lapply, which I would like to avoid:

a.big <- a[a$score >10,]
a.sml <- a[(a$score > 6 & a$score <= 10),]
a.non <- a[a$score < 6,]

a_new <- list('big'=a.big, 'sml'=a.sml)
tar.list <- list(b,c)

test <- lapply(a_new, function(ele_) {
  re <- lapply(tar.list, function(li) {
    out <- base::setdiff(ele_, li)
    return(out)
  })
})

Objective:

I want to avoid the nested lapply and find an efficient alternative. Specifically, I am looking for a better representation of its output, one that is easy and fast to reproduce and that allows fast, easy downstream computation. Is there a general approach for this?

How can I avoid the nested lapply in test? Can anyone suggest ideas for getting around this issue? Thanks.

Best regards,

Jeff

Roman · Accepted Answer

I'm not sure what you really want, but if you want the setdiff of all combinations of both lists, you can use something like this:

# all combinations (note: this reuses the name a and overwrites the data frame a from the question)
a <- expand.grid(seq_along(a_new), seq_along(tar.list))
a
  Var1 Var2
1    1    1
2    2    1
3    1    2
4    2    2
# apply setdiff row-wise over all combinations
apply(a, 1, function(x, y, z){ setdiff(y[x[1]], z[x[2]])}, a_new, tar.list)[1:2]
[[1]]
[[1]][[1]]
   start stop ID score
2     10   15  b    21
3     19   24  c    12
6     46   51  f    23
9     73   78  i    15
10    82   87  j    19
11    91   96  k    25
13   109  114  m    11
16   136  141  p    17
17   145  150  q    18
18   154  159  r    24


[[2]]
[[2]][[1]]
   start stop ID score
5     37   42  e     9
14   118  123  n     8
15   127  132  o     7
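
The extra level of nesting in the output above comes from single-bracket indexing, which returns a sub-list rather than the element itself. A minimal illustration, using a hypothetical toy list (lst is not part of the question's data):

```r
# Hypothetical toy list, only to show the indexing difference
lst <- list(first = 1:3, second = letters[1:2])

one <- lst[1]    # single bracket: still a list, of length 1
two <- lst[[1]]  # double bracket: the element itself

class(one)  # "list"
class(two)  # "integer"
```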

Using double [[ ]] brackets gives you cleaner output: a single flat list.

apply(a, 1, function(x, y, z){ setdiff(y[[x[1]]],z[[x[2]]])}, a_new, tar.list)

[[1]]
   start stop ID score
2     10   15  b    21
3     19   24  c    12
6     46   51  f    23
9     73   78  i    15
10    82   87  j    19
11    91   96  k    25
13   109  114  m    11
16   136  141  p    17
17   145  150  q    18
18   154  159  r    24

[[2]]
   start stop ID score
5     37   42  e     9
14   118  123  n     8
15   127  132  o     7

[[3]]
   start stop ID score
2     10   15  b    21
3     19   24  c    12
6     46   51  f    23
9     73   78  i    15
10    82   87  j    19
11    91   96  k    25
13   109  114  m    11
16   136  141  p    17
17   145  150  q    18
18   154  159  r    24

[[4]]
   start stop ID score
5     37   42  e     9
14   118  123  n     8
15   127  132  o     7
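
If named results are easier to work with downstream, the same idea can be written with Map over the combination grid. This is only a sketch under assumptions: the toy data is made up and smaller than the question's, and rows_not_in is a hypothetical helper that compares rows by ID, which may or may not be the comparison you actually need.

```r
# Hypothetical toy inputs, named so the results can be labelled
a_new <- list(
  big = data.frame(ID = c("a", "b", "c"), score = c(21, 12, 23)),
  sml = data.frame(ID = c("d", "e"), score = c(9, 8))
)
tar.list <- list(
  b = data.frame(ID = c("a", "d"), score = c(21, 3)),
  c = data.frame(ID = c("b", "e"), score = c(5, 8))
)

# One row per (a_new element, tar.list element) pair
grid <- expand.grid(x = names(a_new), y = names(tar.list),
                    stringsAsFactors = FALSE)

# Hypothetical helper: rows of x whose ID does not occur in y
rows_not_in <- function(x, y) x[!x$ID %in% y$ID, , drop = FALSE]

# Map walks the two grid columns in parallel, so no nesting is needed
res <- Map(function(i, j) rows_not_in(a_new[[i]], tar.list[[j]]),
           grid$x, grid$y)
names(res) <- paste(grid$x, grid$y, sep = "_vs_")

# Optional: stack everything into one data frame with a label column
long <- do.call(rbind, Map(function(d, nm) cbind(pair = rep(nm, nrow(d)), d),
                           res, names(res)))
```

Here res is a flat list with one named element per pair (for example res$big_vs_b), and long holds the same rows in a single data frame, which tends to be simpler to filter or aggregate later.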
