R - Is it possible to unnest a list-column that contains missing (NA) values?

Question

The tibble below has a list-column property that contains some missing values:

library(tidyverse)

tbl = tibble(type = c('scale', 'range', 'min', 'max'), 
         property = list(list(lttr = letters, mth = month.name), NA) %>% 
           rep(., 2))
# A tibble: 4 x 2
  type  property  
  <chr> <list>    
1 scale <list [2]>
2 range <lgl [1]> 
3 min   <list [2]>
4 max   <lgl [1]> 

I would like to unnest this column and then spread the result into a wide format with three columns - type, lttr and mth:

tbl = tibble(type = c('scale', 'range', 'min', 'max'), 
             property = list(list(lttr = letters, mth = month.name), NA) %>% 
               rep(., 2)) %>% 
  mutate(property = map_if(property, is_list, enframe)) %>% 
  unnest(property) %>%
  spread(name, value)

However, the unnest call throws the following error:

Error: Each column must either be a list of vectors or a list of data frames [property]

I came across a similar issue on GitHub that asks for unnest to support NULL values, but it makes no mention of NAs. There don't appear to be any arguments in the function documentation that deal with missing values either, but I could be wrong.

The pipeline works as expected if the NAs are filtered out:

tbl = tibble(type = c('scale', 'range', 'min', 'max'), 
             property = list(list(lttr = letters, mth = month.name), NA) %>% 
               rep(., 2)) %>% 
  mutate(property = map_if(property, is_list, enframe)) %>% 
  filter(!is.na(property)) %>% # drop_na() and na.omit() do not work here; not sure why
  unnest(property) %>%
  spread(name, value)

tbl
# A tibble: 2 x 3
  type  lttr       mth       
  <chr> <list>     <list>    
1 min   <chr [26]> <chr [12]>
2 scale <chr [26]> <chr [12]>

akrun · Accepted Answer

One option is to convert every element into a tibble, so that the structure is the same across all rows when unnesting, rather than subsetting manually:

library(tidyverse)
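# tbl is the original 4-row tibble from the question (before filtering)
tbl %>%
    # wrap every element, list or NA, in the same name/value tibble layout
    mutate(property = map(property, ~ if(!is.list(.x))
        enframe(list(nm1 = .x)) else enframe(.x))) %>%
    unnest(property) %>% 
    spread(name, value) %>%
    select(type, lttr, mth)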
# A tibble: 4 x 3
#  type  lttr       mth       
#  <chr> <list>     <list>    
#1 max   <NULL>     <NULL>    
#2 min   <chr [26]> <chr [12]>
#3 range <NULL>     <NULL>    
#4 scale <chr [26]> <chr [12]>

The issue in the OP's example is the difference in structure between the NA rows and the other rows. Once the NA rows are filtered out, the structure is the same across all rows, which is why the error goes away.
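
The mismatch can be seen directly by checking the class of each element after the map_if() step from the question. This is a minimal sketch, assuming the same tbl as in the question; base is.list() stands in for is_list(), and the names converted and classes are only illustrative:

```r
library(tibble)
library(purrr)

tbl <- tibble(
  type = c("scale", "range", "min", "max"),
  property = rep(list(list(lttr = letters, mth = month.name), NA), 2)
)

# Only the list elements are enframed; the NA elements pass through unchanged
converted <- map_if(tbl$property, is.list, enframe)

# First class of each element: tibbles for the list rows, logical for the NA rows
classes <- map_chr(converted, ~ class(.x)[1])
print(classes)
```

The elements alternate between tibbles and bare logical vectors, which is the mix that unnest rejects in the error from the question.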

We can also check this with another example, where the number of list elements is greater than 2.

tbl1 <- tibble(type = c('scale', 'range', 'min', 'max'), 
      property = list(list(lttr = letters, mth = month.name, 
       val1 = rnorm(12), val2 = runif(12)), NA) %>% 
        rep(., 2))

tbl1 %>% 
   # same idea as above: every element becomes a name/value tibble
   mutate(property = map(property, ~ if(!is.list(.x)) enframe(list(nm1 = .x)) 
          else enframe(.x))) %>% 
   unnest(property) %>%
   spread(name, value) %>%
   select(-nm1)
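# A tibble: 4 x 5
#  type  lttr       mth        val1       val2      
#  <chr> <list>     <list>     <list>     <list>    
#1 max   <NULL>     <NULL>     <NULL>     <NULL>    
#2 min   <chr [26]> <chr [12]> <dbl [12]> <dbl [12]>
#3 range <NULL>     <NULL>     <NULL>     <NULL>    
#4 scale <chr [26]> <chr [12]> <dbl [12]> <dbl [12]>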

This can be extended to an arbitrary number of elements.
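
One way to reuse the same steps for any number of elements is to wrap them in a small function. The sketch below is only an illustration: as_name_value() and widen_property() are hypothetical helper names, the list-column is assumed to be called property, and any_of() assumes a reasonably recent dplyr/tidyselect.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)

# Hypothetical helper: wrap any element (list or NA) in a name/value tibble
as_name_value <- function(x) {
  if (is.list(x)) enframe(x) else enframe(list(nm1 = x))
}

# Hypothetical helper: assumes the list-column is called `property`
widen_property <- function(data) {
  data %>%
    mutate(property = map(property, as_name_value)) %>%
    unnest(property) %>%
    spread(name, value) %>%
    select(-any_of("nm1"))  # drop the placeholder column if it exists
}

tbl1 <- tibble(
  type = c("scale", "range", "min", "max"),
  property = rep(list(list(lttr = letters, mth = month.name,
                           val1 = rnorm(12), val2 = runif(12)), NA), 2)
)

wide <- widen_property(tbl1)
print(names(wide))
```

Using any_of() rather than a bare -nm1 keeps the helper from failing when the input happens to contain no NA rows, in which case no nm1 column is created.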
