tflutre

Reputation: 3546

In R, how to filter lists of lists?

According to the manual, Filter works on vectors, and it also happens to work on lists, e.g.:

z <- list(a=1, b=2, c=3)
Filter(function(i){
  z[[i]] > 1
}, z)
$b
[1] 2

$c
[1] 3

However, it doesn't work on lists of lists, e.g.:

z <- list(z1=list(a=1,b=2,c=3), z2=list(a=1,b=1,c=1), z3=list())
Filter(function(i){
  if(length(z[[i]])>0){
    if(z[[i]]$b > 1)
      TRUE
    else
      FALSE
  }
  else
    FALSE
}, z)
Error in z[[i]] : invalid subscript type 'list'

What is the best way then to filter lists of lists without using nested loops? It could also be lists of lists of lists...

(I tried with nested lapply's instead, but couldn't manage to make it work.)

Edit: in the 2nd example, here is what I want to obtain:

list(z1=list(a=1,b=2,c=3))

that is, without z$z2 because z$z2$b is not greater than 1, and without z$z3 because it is empty.

Upvotes: 11

Views: 16398

Answers (5)

Nick

Reputation: 3374

The modern tidy solution to this problem would be:

library(tidyverse)
z <- list(z1=list(a=1,b=2,c=3), z2=list(a=1,b=1,c=1), z3=list())

Then simply:

tibble(disc = z, Names = names(z)) %>% 
  hoist(disc, c = "c") %>%
  filter(c == 3) %>%
  unnest_wider(disc) %>% 
  split(.$Names) %>% map(select, -Names) %>% 
  map(as.list)

Note that this is very flexible and easily allows other filters, e.g. a == 1.

Upvotes: 4

Phil

Reputation: 1194

This filters the elements of each sublist by name (here, "key"). I wrote it after reading the other answers, which helped me.

zall<-list(z1=list(list(key=1,b=2,c=3),list(key=2,b=3,c=4)))
zall
#> $z1
#> $z1[[1]]
#> $z1[[1]]$key
#> [1] 1
#> 
#> $z1[[1]]$b
#> [1] 2
#> 
#> $z1[[1]]$c
#> [1] 3
#> 
#> 
#> $z1[[2]]
#> $z1[[2]]$key
#> [1] 2
#> 
#> $z1[[2]]$b
#> [1] 3
#> 
#> $z1[[2]]$c
#> [1] 4
lapply(zall$z1, function(x){ x[intersect(names(x),"key")]  } )
#> [[1]]
#> [[1]]$key
#> [1] 1
#> 
#> 
#> [[2]]
#> [[2]]$key
#> [1] 2
lapply(zall$z1, function(x){ x[setdiff(names(x),"key")]  } )
#> [[1]]
#> [[1]]$b
#> [1] 2
#> 
#> [[1]]$c
#> [1] 3
#> 
#> 
#> [[2]]
#> [[2]]$b
#> [1] 3
#> 
#> [[2]]$c
#> [1] 4

Upvotes: 0

S4M

Reputation: 4661

I think you should use:

Filter(function(x){length(x)>0 && x[["b"]] > 1},z)

The predicate (the function you are using to filter z) applies to the elements of z, not their indexes.
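To make that concrete, here is a minimal sketch using the question's second example. The helper name has_b_gt1 is made up for illustration, and the extra is.null check is an assumption added so that sublists lacking a b element are dropped instead of tripping up the comparison.

```r
z <- list(z1 = list(a = 1, b = 2, c = 3),
          z2 = list(a = 1, b = 1, c = 1),
          z3 = list())

# Hypothetical predicate: it receives one element of z (a sublist), not an index
has_b_gt1 <- function(x) {
  length(x) > 0 && !is.null(x[["b"]]) && x[["b"]] > 1
}

res <- Filter(has_b_gt1, z)
str(res)
```

Only z1 should survive here: z2 fails the b > 1 test and z3 is empty.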

Upvotes: 8

JD Long

Reputation: 60746

I had never used Filter prior to your question, so this was a good exercise for first thing in the morning :)

There are at least a couple of things going on that are tripping you up (I think).

Let's start with your first simple anonymous function, but let's make it stand alone so it's easier to read:

f <- function(i){
        z[[i]] > 1
     }

It should jump out at you that this function takes one argument, i, yet in the function it calls z. That's not very good "functional" programming :)

So start by changing that function to:

f <- function(i){
        i > 1
     }

And you'll see Filter will actually run against a list of lists:

 z <- list(z1=list(a=1,b=2,c=3), z2=list(a=1,b=1,c=1))
 Filter( f, z)

but it returns:

> Filter( f, z)
$z2
$z2$a
[1] 1

$z2$b
[1] 1

$z2$c
[1] 1


$<NA>
NULL

which isn't exactly what you want. Honestly I can't grok why it returns that result, maybe someone can explain it to me.
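One plausible explanation, assuming base R's Filter boils down to x[which(as.logical(unlist(lapply(x, f))))] (worth confirming by printing Filter at the console): f applied to a sublist returns one logical per inner element, so the index vector ends up with six entries instead of two. The sketch below retraces those steps by hand.

```r
z <- list(z1 = list(a = 1, b = 2, c = 3), z2 = list(a = 1, b = 1, c = 1))
f <- function(i) i > 1

# f() on a sublist compares every inner element, giving 3 logicals per sublist
ind <- as.logical(unlist(lapply(z, f)))
ind

# positions of the TRUE values; these are then used to index the 2-element z
pos <- which(ind)
pos

# position 2 picks z2; position 3 is out of range and shows up as $<NA> NULL
res <- z[pos]
names(res)
```

So the predicate has to return a single TRUE or FALSE per top-level element for Filter to behave as expected.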

@DWin was barking up the right tree when he said that there should be a recursive solution. I hacked up a first stab at a recursive function, but you'll need to improve on it:

fancyFilter <- function(f, x){
  if ( is.list( x[[1]] ) ) #only testing the first element... bad practice
    lapply( x, fancyFilter, f=f ) #recursion FTW!!
  else
    return( lapply(x, Filter, f=f ) )
}

fancyFilter looks at the first element of the x passed to it and if that element is a list, it recursively calls fancyFilter on each element of the list. But what if element #2 is not a list? That's the sort of thing you should test and tease out whether it matters for you. But the result of fancyFilter seems to look like what you are after:

> fancyFilter(f, z)
$z1
$z1$a
numeric(0)

$z1$b
[1] 2

$z1$c
[1] 3


$z2
$z2$a
numeric(0)

$z2$b
numeric(0)

$z2$c
numeric(0)

You may want to add some logic to clean up the output so the FALSE results don't get turned into numeric(0). And, obviously, I did an example using only your simple function, not the more complex function you used in the second example.

Upvotes: 2

IRTFM

Reputation: 263352

No claims for beauty here and it does not do a depth search:

z2 <- lapply(z, function(x){ if( "b" %in% names(x) && x[["b"]] >1 ) x else {}   } )
z2[unlist(lapply(z2, is.null))] <- NULL

> z2
$z1
$z1$a
[1] 1

$z1$b
[1] 2

$z1$c
[1] 3

EDIT: This code will traverse a list and assemble the nodes that have 'b' > 1. It needs some work to properly label the nodes. First a list with deeper nesting:

z <- list(z1=list(a=1,b=2,c=3), z2=list(a=1,b=1,c=1), z3=list(),
          z4 = list(z5=list(a=5,b=6,c=7), z6=list(a=7,b=8,c=9)))

checkbGT1 <- function(ll) {
  root <- list()
  for (i in seq_along(ll)) {
    if ("b" %in% names(ll[[i]]) && ll[[i]]$b > 1) {
      # node has b > 1: append its elements to the result
      root <- c(root, ll[[i]])
    } else if (length(ll[[i]]) && is.list(ll[[i]])) {
      # non-empty list without a qualifying b: search one level deeper
      root <- c(root, list(checkbGT1(ll[[i]])))
    }
  }
  return(root)
}
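On the labelling caveat, one possible variation keeps each matching node under its original name and prunes branches that end up empty. This is only a sketch: filter_nested and b_gt1 are hypothetical names, and it assumes every list level is fully and uniquely named.

```r
# Hypothetical helper: keep nodes where pred() is TRUE, recurse into other
# lists, and drop branches that end up empty. Assumes uniquely named lists.
filter_nested <- function(x, pred) {
  out <- list()
  for (nm in names(x)) {
    node <- x[[nm]]
    if (!is.list(node)) next            # skip atomic leaves such as a = 1
    if (isTRUE(pred(node))) {
      out[[nm]] <- node                 # keep the whole node under its own name
    } else {
      sub <- filter_nested(node, pred)  # look one level deeper
      if (length(sub) > 0) out[[nm]] <- sub
    }
  }
  out
}

# Hypothetical predicate: TRUE when the node has an element b greater than 1
b_gt1 <- function(node) !is.null(node[["b"]]) && node[["b"]] > 1

z <- list(z1 = list(a = 1, b = 2, c = 3), z2 = list(a = 1, b = 1, c = 1),
          z3 = list(),
          z4 = list(z5 = list(a = 5, b = 6, c = 7), z6 = list(a = 7, b = 8, c = 9)))

res <- filter_nested(z, b_gt1)
str(res)
```

With this input, z1 is kept as is, z2 and z3 are dropped, and z4 is kept because both of its children match.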

Upvotes: 0
