user2498497

Reputation: 703

Remove NULL element and unlist the local level list from a nested list in R

Suppose I have a list, say it has three levels:

tmp = list(list(list(c(2,9,10), NULL), c(1,3,4,6)), 7)

This would output

[[1]]
[[1]][[1]] 
[[1]][[1]][[1]]
[1]  2  9 10

[[1]][[1]][[2]]
NULL

[[1]][[2]]
[1] 1 3 4 6

[[2]]
[1] 7

I would like to remove the NULL element and the list level that contained it, i.e., the nested list tmp would then have only 2 levels and become

tmp = list(list(c(2,9,10), c(1,3,4,6)), 7)

That is, the desired output would be the following:

tmp
[[1]]
[[1]][[1]]
[1]  2  9 10

[[1]][[2]]
[1] 1 3 4 6

[[2]]
[1] 7

I have tried to search for the index position of NULL but without luck. Furthermore, I am not sure how to detect and unlist the list that contains the NULL element within the list. Thanks!

Upvotes: 11

Views: 3008

Answers (3)

Joris C.

Reputation: 6234

Update June 2020: this can now also be done with rrapply in the rrapply-package (a revised version of base rapply). Using how = "prune", we can prune all NULL elements from the list while keeping the original list structure. For instance, using the same list objects as in Beasterfield's response:

library(rrapply)

## Example 1 
tmp <- list(list(list(c(2,9,10), NULL), c(1,3,4,6)), 7) 

## keep only non-NULL leafs
rrapply(tmp, condition = Negate(is.null), how = "prune")
#> [[1]]
#> [[1]][[1]]
#> [[1]][[1]][[1]]
#> [1]  2  9 10
#> 
#> 
#> [[1]][[2]]
#> [1] 1 3 4 6
#> 
#> 
#> [[2]]
#> [1] 7


## Example 2
tree <- list(
    list(
        list(
            list(
                list( NULL, NULL ),
                list( NULL, NULL )
            ),
            7
        ),
        list(
            list(
                list( c(1,2), NULL ),
                c(3,4)
            ))))

## branches with only NULLs are completely pruned
rrapply(tree, condition = Negate(is.null), how = "prune")
#> [[1]]
#> [[1]][[1]]
#> [[1]][[1]][[1]]
#> [1] 7
#> 
#> 
#> [[1]][[2]]
#> [[1]][[2]][[1]]
#> [[1]][[2]][[1]][[1]]
#> [[1]][[2]][[1]][[1]][[1]]
#> [1] 1 2
#> 
#> 
#> [[1]][[2]][[1]][[2]]
#> [1] 3 4

Upvotes: 0

Stéphane Laurent

Reputation: 84529

I'm using this function:

removeNULL <- function(x){
    x <- Filter(Negate(is.null), x)
    if( is.list(x) ){
      x <- lapply( x, function(y) Filter(length, removeNULL(y)))
    }
    return(x)
}

Not only does it remove the NULL elements, it also removes elements that are lists containing only NULL elements, such as A2$A2$format$font in the example below:

> A2
$A2
$A2$value
[1] 9.9

$A2$format
$A2$format$numberFormat
[1] "2Decimal"

$A2$format$font
$A2$format$font$name
NULL

$A2$format$font$bold
NULL

$A2$format$font$color
NULL



$A2$comment
NULL


> removeNULL(A2)
$A2
$A2$value
[1] 9.9

$A2$format
$A2$format$numberFormat
[1] "2Decimal"
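To reproduce this, A2 can be built as in the sketch below. The construction is an assumption reconstructed from the printed output above, not the code originally used to create it.

```r
removeNULL <- function(x) {
  # drop NULL elements at this level
  x <- Filter(Negate(is.null), x)
  if (is.list(x)) {
    # recurse, then drop children that ended up empty
    x <- lapply(x, function(y) Filter(length, removeNULL(y)))
  }
  x
}

# Assumed construction of A2, inferred from the printout
A2 <- list(A2 = list(
  value   = 9.9,
  format  = list(
    numberFormat = "2Decimal",
    font = list(name = NULL, bold = NULL, color = NULL)
  ),
  comment = NULL
))

cleaned <- removeNULL(A2)
str(cleaned)
```

Note that list(name = NULL) really stores a NULL element, which is why the font branch exists in A2 before cleaning.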

Upvotes: 1

Beasterfield

Reputation: 7113

Typically, you remove NULL elements from a flat list with

ll <- list( 1, 2, NULL, 3 )
ll <- ll[ ! sapply(ll, is.null) ]
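A minimal sketch of two equivalent spellings of the flat case, in case the return type of sapply() is a concern: vapply() always returns a logical vector, even for an empty list, and Filter() with a negated predicate does the same in one call. The names keep, a and b are only for illustration.

```r
ll <- list(1, 2, NULL, 3)

# vapply() guarantees a logical vector, also when ll is empty
keep <- !vapply(ll, is.null, logical(1))
a <- ll[keep]

# Filter() keeps the elements for which the predicate is TRUE
b <- Filter(Negate(is.null), ll)

identical(a, b)  # both are list(1, 2, 3)
```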

If you do not know the structure in advance, this is a clear case for combining this solution with a recursive function:

removeNullRec <- function( x ){  
  x <- x[ !sapply( x, is.null ) ]
  if( is.list(x) ){
    x <- lapply( x, removeNullRec)
  }
  return(x)
}

removeNullRec(tmp)

[[1]]
[[1]][[1]]
[[1]][[1]][[1]]
[1]  2  9 10


[[1]][[2]]
[1] 1 3 4 6


[[2]]
[1] 7

Edit

It's always good to rephrase the problem as simply as possible. What I understood from your comments is that (independent of the occurrence of NULL elements) you want to replace each element that contains only one child by the child itself. There is another case to consider as well: two sibling leaves could both be NULL. So let's start with a slightly more complicated example:

tree <- list(
  list(
    list(
      list(
        list( NULL, NULL ),
        list( NULL, NULL )
      ),
      7
    ),
    list(
      list(
        list( c(1,2), NULL ),
        c(3,4)
))))

This isolated problem of flattening the tree is of course also best solved with a recursive approach:

flatTreeRec <- function( x ){
  if( is.list(x) ){
    # recursion
    x <- lapply( x, flatTreeRec )
    # remove empty branches
    x <- x[ sapply( x, length ) > 0 ]
    # flatten branches with only one child
    if( length(x) == 1 ){
      x <- x[[1]]
    }
  }
  return(x)
}

flatTreeRec( removeNullRec(tree) )
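As a sketch of what this two-step call does, here is a self-contained version with the example tree. It uses vapply() and lengths() instead of sapply() so that empty lists are handled predictably; that substitution is my own, the logic is otherwise the same as above.

```r
# Same logic as above, restated so the block runs on its own
removeNullRec <- function(x) {
  x <- x[!vapply(x, is.null, logical(1))]   # drop NULLs at this level
  if (is.list(x)) x <- lapply(x, removeNullRec)
  x
}

flatTreeRec <- function(x) {
  if (is.list(x)) {
    x <- lapply(x, flatTreeRec)             # recurse first
    x <- x[lengths(x) > 0]                  # remove empty branches
    if (length(x) == 1) x <- x[[1]]         # unwrap a single child
  }
  x
}

# The example tree, written compactly
tree <- list(list(
  list(list(list(NULL, NULL), list(NULL, NULL)), 7),
  list(list(list(c(1, 2), NULL), c(3, 4)))
))

flat <- flatTreeRec(removeNullRec(tree))
str(flat)
```

The all-NULL branch disappears entirely, and every list with a single child is replaced by that child, so the result is equivalent to list(7, list(c(1, 2), c(3, 4))).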

And of course you can directly combine these two functions to avoid stressing your stack twice:

removeNullAndFlatTreeRec <- function( x ){  
  x <- x[ !sapply( x, is.null ) ]
  if( is.list(x) ){
    x <- lapply( x, removeNullAndFlatTreeRec )
    x <- x[ sapply( x, length ) > 0 ]
    if( length(x) == 1 ){
      x <- x[[1]]
    }
  }
  return(x)
}

removeNullAndFlatTreeRec( tree )
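To check the one-pass idea against the list from the question, here is a minimal sketch. The function name pruneAndFlatten is hypothetical and the body is my own restatement (vapply() and lengths() in place of sapply()). The important detail is that the recursive call goes to the function itself, so flattening happens at every level.

```r
# Hypothetical name; one pass that drops NULLs, prunes empty branches
# and unwraps single-child lists
pruneAndFlatten <- function(x) {
  if (!is.list(x)) return(x)                # leaves are returned untouched
  x <- x[!vapply(x, is.null, logical(1))]   # drop NULL children
  x <- lapply(x, pruneAndFlatten)           # recurse into this same function
  x <- x[lengths(x) > 0]                    # drop branches that became empty
  if (length(x) == 1) x <- x[[1]]           # unwrap a single remaining child
  x
}

tmp <- list(list(list(c(2, 9, 10), NULL), c(1, 3, 4, 6)), 7)
res <- pruneAndFlatten(tmp)
identical(res, list(list(c(2, 9, 10), c(1, 3, 4, 6)), 7))
```

One caveat of this design: a list with a single element is unwrapped at the top level too, so pruneAndFlatten(list(5)) returns 5 rather than list(5).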

Upvotes: 16
