Francisco

Reputation: 161

Search unnamed lists of lists for string

I have a list of unnamed lists, each holding named data frames of varying lengths. I'm looking for a way to grep or otherwise search the names of the list elements to find specific data frames by name.

Here is the current method:

library(tibble) # for tibbles 
## list of lists of dataframes 
abc_list <- list(list(dfAAA = tibble(names = state.abb[1:10]),
                      dfBBB = tibble(junk = state.area[5:15]),
                      dfAAA2 = tibble(names = state.abb[8:20])),
                 list(dfAAA2 = tibble(names = state.abb[10:15]),
                      dfCCC  = tibble(junk2 = state.area[4:8]),
                      dfGGG  = tibble(junk3 = state.area[12:14])))

# Open list, manually ID list index which has "AAA" dfs
# extract from list of lists into separate list 
desired_dfs_list <- abc_list[[1]][grepl("AAA", names(abc_list[[1]]))]

# unlist that list into a combined df
desired_rbinded_list <- as.data.frame(data.table::rbindlist(desired_dfs_list, use.names = FALSE))

I know there's a better way than this.

What I've attempted so far:

## attempt:
## find pattern in df names 
aaa_indices <- sapply(abc_list, function(x) grep(pattern = "AAA", names(x)))

## apply that to rbind ??? 
desired_aaa_rbinded_list <- purrr::map_df(aaa_indices, data.table::rbindlist(abc_list))

The steps from the manual example would be:

  1. pull identified list items (dfs) into a separate list
  2. rbind the list of dfs into one df

I'm just not sure how to do that in a way that allows me more flexibility, instead of manually opening the lists and ID-ing the indices to pull.

Thanks for any help or ideas!

Upvotes: 2

Views: 41

Answers (1)

Sandwichnick

Reputation: 1466

If your tibbles (or data frames) are always exactly one level deep (that is, an outer list whose elements are lists of data frames), you can use unlist to remove the first level of nesting:

all_dfs_list <- unlist(abc_list,
       recursive = FALSE # will stop unlisting after the first level
       )

This will result in a list of tibbles:

> all_dfs_list
$dfAAA
# A tibble: 10 x 1
   names
   <chr>
 1 AL   
 2 AK   
...

Then you can filter by name and use rbindlist on the desired elements, as you already did in your question:

desired_dfs_list <- all_dfs_list[grepl("AAA", names(all_dfs_list))]

# rbindlist lives in data.table; call it with the namespace prefix
desired_rbinded_list <- as.data.frame(
  data.table::rbindlist(desired_dfs_list, use.names = FALSE)
  )
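
If you need this for more than one pattern, the two steps above can be wrapped in a small function. This is only a sketch, not part of the original answer: the name bind_matching_dfs is made up for illustration, the sample data uses plain data frames instead of tibbles so that it runs without extra packages, and base rbind stands in for data.table::rbindlist. Unlike rbindlist with use.names = FALSE, base rbind requires the matched data frames to share the same column names.

```r
# Hypothetical helper (name is an assumption, not from the answer):
# flatten one level, keep the data frames whose names match `pattern`,
# then row-bind them with base R.
bind_matching_dfs <- function(nested, pattern) {
  flat <- unlist(nested, recursive = FALSE)   # list of lists -> flat list of data frames
  hits <- flat[grepl(pattern, names(flat))]   # keep elements whose name matches
  if (length(hits) == 0) return(NULL)         # nothing matched
  out <- do.call(rbind, unname(hits))         # column names must agree across hits
  rownames(out) <- NULL
  out
}

# Same shape as the data in the question, built from plain data frames
abc_list <- list(
  list(dfAAA  = data.frame(names = state.abb[1:10]),
       dfBBB  = data.frame(junk  = state.area[5:15]),
       dfAAA2 = data.frame(names = state.abb[8:20])),
  list(dfAAA2 = data.frame(names = state.abb[10:15]),
       dfCCC  = data.frame(junk2 = state.area[4:8]),
       dfGGG  = data.frame(junk3 = state.area[12:14]))
)

result <- bind_matching_dfs(abc_list, "AAA")
nrow(result)  # 10 + 13 + 6 = 29 rows
```

Calling bind_matching_dfs(abc_list, "CCC") would return only the dfCCC rows, and a pattern that matches nothing returns NULL instead of raising an error.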

Upvotes: 1
