user113156

Reputation: 7127

Using purrr to skip over errors encountered due to empty lists in data

I have two types of lists: one populated and one unpopulated.

I can run the following code which processes and cleans the lists up for me.

library(purrr)
library(tidyverse)
list1 %>%
  map(., ~unlist(.x) %>% 
        data.frame() %>% 
        rownames_to_column("tag") %>% 
        setNames(c("tag", "info")) %>% # renames the columns
        pivot_wider(names_from = "tag", values_from = "info")) %>% 
  setNames(c(paste("skills", seq_along(1:length(.)), sep = "_"))) %>%  # renames the list
  bind_rows()

Which gives:

# A tibble: 6 x 2
  name   endorsements
  <chr>  <chr>       
1 skill1 9           
2 skill2 8           
3 skill3 6           
4 skill4 5           
5 skill5 4           
6 skill6 3  

However, some of the lists contain nothing. When I try to run the following code:

list2 %>%
  map(., ~unlist(.x) %>% 
        data.frame() %>% 
        rownames_to_column("tag") %>% 
        setNames(c("tag", "info")) %>% # renames the columns
        pivot_wider(names_from = "tag", values_from = "info")) %>% 
  setNames(c(paste("skills", seq_along(1:length(.)), sep = "_"))) %>%  # renames the list
  bind_rows()

I get this error:

Error in names(object) <- nm : 'names' attribute [2] must be the same length as the vector [0]

The error comes from the line setNames(c(paste("skills", seq_along(1:length(.)), sep = "_"))) since it is trying to rename an empty list. I have tried wrapping one of purrr's safely, quietly and possibly functions around it without luck.
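For reference, the length of 2 reported in the error message can be reproduced by evaluating the naming expression on its own against an empty list. This is a minimal base R sketch; the variable names `empty`, `idx`, `positions` and `new_names` are illustrative and not part of the original code. It also shows `sprintf()` as one possible replacement for `paste()`, offered as an assumption rather than the only fix.

```r
# Evaluate the naming expression on an empty list, outside the pipeline.
empty <- list()

idx <- 1:length(empty)        # 1:0 counts down, giving c(1, 0)
positions <- seq_along(idx)   # c(1, 2), so two names are generated
new_names <- paste("skills", positions, sep = "_")
print(new_names)              # "skills_1" "skills_2"

# seq_along(empty) alone is integer(0), but paste() treats a zero-length
# argument as "", so it still yields one name ("skills_").
print(paste("skills", seq_along(empty), sep = "_"))

# sprintf() returns character(0) for zero-length input, so the rename
# becomes a no-op on an empty list instead of an error.
renamed <- setNames(empty, sprintf("skills_%d", seq_along(empty)))
print(length(renamed))        # 0
```
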

How can I "skip" over the results which return an error or have empty lists? I still want something returned, since even if the list is empty, its list location (i.e. list[[762]]) corresponds to another list in a separate object located at list[[762]], so removing empty lists is not a suitable option.

List 1: (populated)

list1 <- list(c(name = "skill1", endorsements = "9"), c(name = "skill2", 
                                                        endorsements = "8"), c(name = "skill3", 
                                                                               endorsements = "6"), c(name = "skill4", endorsements = "5"), 
              c(name = "skill5", endorsements = "4"), c(name = "skill6", 
                                                        endorsements = "3"))

List 2: (unpopulated)

list2 <- list()

Upvotes: 1

Views: 567

Answers (1)

Oliver

Reputation: 8582

Change setNames to vars_rename from the tidyselect package, which allows you to skip missing columns by setting strict = FALSE.

library(purrr)
library(dplyr)
library(tibble)     # rownames_to_column()
library(tidyr)      # pivot_wider()
library(tidyselect)
list2 %>%
  map(., ~unlist(.x) %>% 
        data.frame() %>% 
        rownames_to_column("tag") %>% 
        vars_rename(c("tag", "info"), strict = FALSE) %>% # renames the columns
        pivot_wider(names_from = "tag", values_from = "info"))
[1] list()

This will still fail at your later renaming stage, however:

list2 %>% 
    vars_rename(c(paste('skills', seq_along(.), sep = '_')), strict = FALSE)

Error: vars must be a character vector
Run rlang::last_error() to see where the error occurred.

The problem is simply that the pipe passes list2 itself, an empty list(), as the first argument of vars_rename, which requires a character vector there. Some error handling seems to be in order, or simply a test for whether the partial result is an empty list.
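The "test for an empty list" idea could look roughly like the sketch below. It is not the code from the question: `clean_skills` and `all_lists` are hypothetical names, and the per-element conversion uses `as_tibble(as.list(.x))` in place of the original `unlist()` / `pivot_wider()` steps. It assumes each non-empty element is a named character vector, as in list1.

```r
library(purrr)
library(dplyr)
library(tibble)

# Hypothetical helper: converts one list of named character vectors
# into a tibble with one row per element.
clean_skills <- function(x) {
  # Guard: an empty input yields an empty tibble instead of an error,
  # so its position in the outer list is preserved.
  if (length(x) == 0) {
    return(tibble())
  }
  x %>%
    map(~ as_tibble(as.list(.x))) %>%
    bind_rows()
}

list1 <- list(c(name = "skill1", endorsements = "9"),
              c(name = "skill2", endorsements = "8"))
list2 <- list()

# Assumed outer structure: one entry per record, some of them empty.
all_lists <- list(list1, list2)
results <- map(all_lists, clean_skills)

length(results)     # same length as all_lists
nrow(results[[2]])  # 0 rows for the empty entry
```

Because every input position produces a tibble, empty or not, results[[i]] still lines up with element i of any parallel list.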

Upvotes: 2
