statsNoob

Reputation: 1355

How to determine which list elements contain a record in R

If you have a list of vectors, what is a good way of determining which list elements contain a specific record?

set.seed(8675309)
aList <- list(v1=sample(LETTERS, 20), 
              v2=sample(LETTERS, 10))

The output of aList looks like this:

 > aList
$v1
 [1] "E" "L" "S" "R" "F" "O" "T" "Q" "P" "H" "N" "I" "X" "D" "U" "K" "W" "B" "G" "V"

$v2
 [1] "B" "V" "U" "H" "M" "O" "F" "Z" "C" "N"

I want something like this:

which("B" %in% aList[names(aList)]) # It would be nice if this returned v1 and v2
which("E" %in% aList[names(aList)]) # It would be nice if this returned v1
which("C" %in% aList[names(aList)]) # It would be nice if this returned v2

Upvotes: 3

Views: 1702

Answers (2)

akrun

Reputation: 886948

Instead of checking each value individually, we can use outer with %in% to get a logical matrix ('m1'), split it by row, and look up the corresponding names of 'aList'.

v1 <- c('B', 'E', 'C')
m1 <- outer(v1, aList, FUN= Vectorize(`%in%`))
lapply(split(m1, row(m1)), function(x) names(aList)[x])
# $`1`
# [1] "v1" "v2"

# $`2`
# [1] "v1"

# $`3`
# [1] "v2"
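To make the intermediate logical matrix easier to picture, here is a minimal sketch on a small fixed list (so the result does not depend on the random seed). It swaps outer for sapply, which is a different route to the same kind of matrix, and the name `records` is only an illustrative choice.

```r
# Fixed example data (assumed for illustration, not the sampled list above)
aList <- list(v1 = c("B", "E", "F"), v2 = c("B", "C"))
records <- c("B", "E", "C")

# One row per record, one column per list element
m1 <- sapply(aList, function(el) records %in% el)
rownames(m1) <- records

# For each row, keep the column names where the record was found
res <- lapply(setNames(seq_along(records), records),
              function(i) colnames(m1)[m1[i, ]])
res
```

Here `m1` has TRUE wherever a record occurs in a list element, so indexing the column names by each row gives the matching element names.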

Alternatively, we can melt 'm1' into long format and split the matching names of 'aList' (Var2) by the row index column (Var1).

library(reshape2)
with(melt(m1), split(as.character(Var2[value]), Var1[value]))
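If adding reshape2 is not an option, roughly the same long-format idea can be sketched in base R with which(..., arr.ind = TRUE). This is only an illustration on a hand-built logical matrix with assumed row names; it is not part of the approach above.

```r
# Hand-built logical matrix: rows are records, columns are list elements
m1 <- matrix(c(TRUE, TRUE, FALSE,   # column v1
               TRUE, FALSE, TRUE),  # column v2
             nrow = 3,
             dimnames = list(c("B", "E", "C"), c("v1", "v2")))

# Row/column positions of every TRUE cell (the "long" format)
idx <- which(m1, arr.ind = TRUE)

# Group the matching column names by the record they belong to
by_record <- split(colnames(m1)[idx[, "col"]],
                   rownames(m1)[idx[, "row"]])
by_record
```

Note that split orders the groups by the sorted record labels, so the result is not necessarily in the original row order.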

Upvotes: 3

etienne

Reputation: 3678

names(aList)[sapply(1:2,function(x){"B" %in% aList[[x]]})]
[1] "v1" "v2" 

names(aList)[sapply(1:2,function(x){"E" %in% aList[[x]]})]
[1] "v1"

names(aList)[sapply(1:2,function(x){"C" %in% aList[[x]]})]
[1] "v2"

If the list has an unknown number of elements, use seq_along:

names(aList)[sapply(seq_along(aList),function(x){"B" %in% aList[[x]]})]
[1] "v1" "v2"
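The same lookup can also be written by iterating over the list itself rather than over its indices. The sketch below wraps it in a helper, `which_contain`, which is a hypothetical name introduced here for illustration, and uses a small fixed list so the results are predictable.

```r
# Fixed example data (assumed for illustration)
aList <- list(v1 = c("B", "E", "F"), v2 = c("B", "C"))

# Hypothetical helper: names of the list elements that contain `record`
which_contain <- function(lst, record) {
  # vapply guarantees one logical value per list element
  hits <- vapply(lst, function(el) record %in% el, logical(1))
  names(lst)[hits]
}

which_contain(aList, "B")  # "v1" "v2"
which_contain(aList, "E")  # "v1"
which_contain(aList, "C")  # "v2"
```

For a record that appears nowhere, the helper returns an empty character vector.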

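Here are microbenchmark results addressing the comments. The first compares three ways of generating the index sequence; the second compares two functions, etienne() and roland(), whose definitions are not shown here.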

library(microbenchmark)
microbenchmark(seq_along(aList),seq_along(names(aList)),1:length(aList),times=100000)
Unit: nanoseconds
                    expr min  lq     mean median   uq    max neval cld
        seq_along(aList) 350 700 659.9117    701  701 208228 1e+05 a  
 seq_along(names(aList)) 351 701 857.1508    701 1051 216977 1e+05  b 
         1:length(aList) 700 701 935.7251   1050 1051 424855 1e+05   c

microbenchmark(etienne(),roland())
Unit: microseconds
      expr    min     lq     mean median     uq     max neval cld
 etienne() 40.597 41.297 45.24751 41.646 41.997 211.378   100   b
  roland() 12.600 13.300 14.40882 14.699 15.049  20.998   100  a 

Upvotes: 3
