Dom

Reputation: 1053

purrr::map() a deeply nested list to test for equality of dataframes

Problem

I have a list that contains sets of nested lists. I need to test whether all of the dataframes at the lowest level are equal, and the test has to respect the grouping of the data.

I am trying to solve the problem using purrr::map() but I am having real trouble understanding how I can iterate over each sub-list.

I have used gapminder in this example only because it can be nested twice, which is the same as my actual data (which I can't share here).

The data

library(dplyr)
library(tidyr)      # nest() comes from tidyr
library(gapminder)
library(purrr)

tf <- gapminder %>% 
  select(continent, country, year) %>% 
  group_by(continent, year) %>% 
  nest() %>% 
  arrange(desc(year)) %>% 
  ungroup() %>% 
  group_by(year) %>% 
  nest()

My attempt

tf$data[[1]] contains a list of data on each continent. It is these lists that I need to check for equality. This dataset produces unequal lists at this level, but that doesn't matter: I just need the pattern for my actual data.

My attempt only allows me to iterate through one list at the bottom level.

map_chr(tf$data[[1]]$data, all_equal, current = tf$data[[1]]$data[[1]])

I need to do this over all of the lists at the bottom level. That is, for each year in tf, and for each continent within that year's element of tf$data, compare the continent's list with the first list at that level (for the first year, tf$data[[1]]$data[[1]]).

Upvotes: 0

Views: 202

Answers (1)

shizundeiku

Reputation: 320

Why not unnest the list one level? Then you can use all dplyr has to offer, like group-wise mutate:

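# tf is still grouped by year (see "Groups: year [12]" in the result),
# so mutate() runs per year and data[[1]] is that year's first tibble.
tf %>%
  unnest(data) %>%
  mutate(equal_to_first = map_chr(data, all_equal, current = data[[1]])) %>%
  unnest(equal_to_first)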

Result:

# A tibble: 60 x 4
# Groups:   year [12]
    year continent data              equal_to_first          
   <int> <fct>     <list>            <chr>                   
 1  2007 Asia      <tibble [33 × 1]> TRUE                    
 2  2007 Europe    <tibble [30 × 1]> Different number of rows
 3  2007 Africa    <tibble [52 × 1]> Different number of rows
 4  2007 Americas  <tibble [25 × 1]> Different number of rows
 5  2007 Oceania   <tibble [2 × 1]>  Different number of rows
 6  2002 Asia      <tibble [33 × 1]> TRUE                    
 7  2002 Europe    <tibble [30 × 1]> Different number of rows
 8  2002 Africa    <tibble [52 × 1]> Different number of rows
 9  2002 Americas  <tibble [25 × 1]> Different number of rows
10  2002 Oceania   <tibble [2 × 1]>  Different number of rows
# … with 50 more rows
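
If you only need one yes/no answer per group rather than a message per row, the same idea can be collapsed with summarise(). The following is a minimal sketch, not a tested solution for your real data: toy is a made-up stand-in for gapminder, same_as() is a hypothetical helper, and base all.equal() is swapped in for all_equal() because all_equal() is deprecated as of dplyr 1.1.0.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical toy data standing in for gapminder: two years, groups "a" and "b".
toy <- tibble(
  year  = c(2007, 2007, 2007, 2002, 2002),
  group = c("a", "a", "b", "a", "b"),
  value = c(1, 2, 1, 1, 1)
)

# Hypothetical helper: TRUE only when all.equal() reports no differences.
same_as <- function(x, ref) isTRUE(all.equal(x, ref))

# One row per (year, group), with the remaining columns in a list-column `data`.
nested <- toy %>%
  group_by(year, group) %>%
  nest() %>%
  ungroup()

# Within each year, compare every tibble with the first one and
# reduce the comparisons to a single logical flag.
per_year <- nested %>%
  group_by(year) %>%
  summarise(
    all_equal_to_first = all(map_lgl(data, same_as, ref = data[[1]])),
    .groups = "drop"
  )

print(per_year)
```

Here the 2007 group mixes a two-row and a one-row tibble, so its flag should come out FALSE, while both 2002 tibbles match.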

If you would like to get your original structure back, you can simply nest the result again.
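
As a rough illustration of that last step, here is a sketch on made-up data (toy and same_as() are hypothetical, and base all.equal() is used instead of all_equal(), which is deprecated as of dplyr 1.1.0). It relies on the same group_by() + nest() pattern as the question to rebuild the outer level.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical toy data standing in for gapminder.
toy <- tibble(
  year  = c(2007, 2007, 2007, 2002, 2002),
  group = c("a", "a", "b", "a", "b"),
  value = c(1, 2, 1, 1, 1)
)

# Hypothetical helper: TRUE only when all.equal() reports no differences.
same_as <- function(x, ref) isTRUE(all.equal(x, ref))

# Flat, one-level-nested form: one row per (year, group),
# compared group-wise against the first tibble of each year.
flat <- toy %>%
  group_by(year, group) %>%
  nest() %>%
  ungroup() %>%
  group_by(year) %>%
  mutate(equal_to_first = map_lgl(data, same_as, ref = data[[1]]))

# Nest again by year: the new outer `data` column now holds
# group, the inner data, and the comparison result.
renested <- flat %>%
  nest()

print(renested)
```

The comparison column travels along inside the outer list-column, so renested$data[[1]]$equal_to_first gives the flags for the first year.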

Upvotes: 1
