yuckeye

Reputation: 41

How to convert nested lists with data frames inside to one data frame in R

How can I convert a fairly complicated nested list, with data frames inside, into one consolidated data frame?

Here are the first 3 records of my raw input, in JSON.

{
    "abc": [[{"variant": "enabled", "allocation": 100}], [{"variant": "control", "allocation": 100}]],
    "def": [[{"variant": "enable", "allocation": 100}]], 
    "hahaha": [[{"variant": "superman", "allocation": 5}, {"variant": "superhero", "allocation": 95}]]
}

Then I loaded this JSON file into R.

library(jsonlite)
myList <- fromJSON("myjsonfile.json")

str(myList)

List of 8988
 $ abc   :List of 2
  ..$ :'data.frame':    1 obs. of  2 variables:
  .. ..$ variant   : chr "enabled"
  .. ..$ allocation: int 100
  ..$ :'data.frame':    1 obs. of  2 variables:
  .. ..$ variant   : chr "control"
  .. ..$ allocation: int 100
 $ def   :List of 1
  ..$ :'data.frame':    1 obs. of  2 variables:
  .. ..$ variant   : chr "enable"
  .. ..$ allocation: int 100
 $ hahaha:List of 1
  ..$ :'data.frame':    2 obs. of  2 variables:
  .. ..$ variant   : chr [1:2] "superman" "superhero"
  .. ..$ allocation: int [1:2] 5 95

As you can see, each list can contain a different number of data frames, and each data frame may contain a different number of observations.

Ideally I want to get one data frame like the one below:

test_name, segments, variant, allocation
abc, 1, enabled, 100
abc, 2, control, 100
def, 1, enable, 100
hahaha, 1, superman, 5
hahaha, 1, superhero, 95

What is a scalable approach for all 8988 records? I appreciate any help.

Upvotes: 1

Views: 68

Answers (2)

Joris C.

Reputation: 6244

Here is an approach that:

  1. Melts the nested list to a data.frame with rrapply() (from the rrapply package).
  2. Reshapes the data.frame using tidyr's pivot_wider() and unnest().

library(rrapply)
library(tidyverse)

## melt to data.frame
mydf <- rrapply(myList, how = "melt")

## reshape data.frame
mydf_reshaped <- pivot_wider(mydf, names_from = "L3") %>%
  unnest(c(variant, allocation)) %>%
  rename(test_name = L1, segments = L2)

mydf_reshaped
#> # A tibble: 5 x 4
#>   test_name segments variant   allocation
#>   <chr>     <chr>    <chr>          <int>
#> 1 abc       ..1      enabled          100
#> 2 abc       ..2      control          100
#> 3 def       ..1      enable           100
#> 4 hahaha    ..1      superman           5
#> 5 hahaha    ..1      superhero         95

This should generalize directly to the complete JSON file as well.
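The output above shows segments as "..1" and "..2", whereas the question asks for 1 and 2. A minimal sketch of one way to convert them, assuming the labels always hold the position as their only digits (the `segments` vector below is a hypothetical stand-in for that column, not taken from the real data):

```r
# Hypothetical stand-in for the `segments` column shown in the output above
segments <- c("..1", "..2", "..1", "..1", "..1")

# Drop every non-digit character, then convert what is left to integer
segments_int <- as.integer(gsub("\\D", "", segments))
segments_int
```

Inside the pipeline the same expression could go in a mutate() call, e.g. mutate(segments = as.integer(gsub("\\D", "", segments))).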


Data

myList <- list(abc = list(structure(list(variant = "enabled", allocation = 100L), class = "data.frame", row.names = 1L), 
                          structure(list(variant = "control", allocation = 100L), class = "data.frame", row.names = 1L)), 
               def = list(structure(list(variant = "enable", allocation = 100L), class = "data.frame", row.names = 1L)), 
               hahaha = list(structure(list(variant = c("superman", "superhero"
               ), allocation = c(5L, 95L)), class = "data.frame", row.names = 1:2)))

Upvotes: 1

David

Reputation: 2677

The easiest way is to use dplyr's bind_rows().

library(dplyr)

df_list <- list(iris[1:5,], iris[6:10,])

bind_rows(df_list)
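The iris example binds a flat list only. For the two-level list in the question, one possible sketch is to call bind_rows() twice and use its .id argument to keep the segment position and the test name. This assumes every inner element is a data frame with the same columns; `myList` below is rebuilt by hand from the question's sample, and `result` is just an illustrative name:

```r
library(dplyr)

# Sample rebuilt from the question (assumption: the real data has the same shape)
myList <- list(
  abc = list(data.frame(variant = "enabled", allocation = 100L),
             data.frame(variant = "control", allocation = 100L)),
  def = list(data.frame(variant = "enable", allocation = 100L)),
  hahaha = list(data.frame(variant = c("superman", "superhero"),
                           allocation = c(5L, 95L)))
)

# Inner bind_rows(): stack the data frames of one test, recording their position
# Outer bind_rows(): stack the tests, recording the list names as test_name
result <- bind_rows(
  lapply(myList, function(segs) bind_rows(segs, .id = "segments")),
  .id = "test_name"
)

# Make sure the segment labels end up as integers
result$segments <- as.integer(result$segments)
result
```

This gives one row per variant, with columns test_name, segments, variant and allocation. I have not timed it on 8988 entries.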

Upvotes: 0
