Reputation: 41
How can I convert a fairly complicated nested list with data frames inside into one consolidated data frame?
Here are the first 3 records of my raw JSON input.
{
"abc": [[{"variant": "enabled", "allocation": 100}], [{"variant": "control", "allocation": 100}]],
"def": [[{"variant": "enable", "allocation": 100}]],
"hahaha": [[{"variant": "superman", "allocation": 5}, {"variant": "superhero", "allocation": 95}]]
}
Then I loaded this JSON file into R.
library(jsonlite)
myList <- fromJSON("myjsonfile.json")
str(myList)
List of 8988
$ abc :List of 2
..$ :'data.frame': 1 obs. of 2 variables:
.. ..$ variant : chr "enabled"
.. ..$ allocation: int 100
..$ :'data.frame': 1 obs. of 2 variables:
.. ..$ variant : chr "control"
.. ..$ allocation: int 100
$ def :List of 1
..$ :'data.frame': 1 obs. of 2 variables:
.. ..$ variant : chr "enable"
.. ..$ allocation: int 100
$ hahaha :List of 1
..$ :'data.frame': 2 obs. of 2 variables:
.. ..$ variant : chr [1:2] "superman" "superhero"
.. ..$ allocation: int [1:2] 5 95
As you can see, each list can hold a different number of data frames, and each data frame can contain a different number of observations.
Ideally I want to get one dataframe as below:
test_name, segments, variant, allocation
abc, 1, enabled, 100
abc, 2, control, 100
def, 1, enable, 100
hahaha, 1, superman, 5
hahaha, 1, superhero, 95
What is a scalable approach for all 8988 records? I appreciate any help.
Upvotes: 1
Reputation: 6244
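Here is an approach that melts the nested list into a long data.frame with rrapply() (in the rrapply package), then reshapes that data.frame into the required format with pivot_wider() and unnest().
library(rrapply)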
library(tidyverse)
## melt to data.frame
mydf <- rrapply(myList, how = "melt")
## reshape data.frame
mydf_reshaped <- pivot_wider(mydf, names_from = "L3") %>%
  unnest(c(variant, allocation)) %>%
  rename(test_name = L1, segments = L2)
mydf_reshaped
#> # A tibble: 5 x 4
#> test_name segments variant allocation
#> <chr> <chr> <chr> <int>
#> 1 abc ..1 enabled 100
#> 2 abc ..2 control 100
#> 3 def ..1 enable 100
#> 4 hahaha ..1 superman 5
#> 5 hahaha ..1 superhero 95
This should directly generalize to the complete json-file as well.
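Note that the segments column above holds labels such as "..1" rather than the plain numbers asked for in the question. A minimal base R sketch of one way to convert them, assuming every label contains exactly one run of digits (the vector below is a hypothetical stand-in for mydf_reshaped$segments, not output from the real file):

```r
# Hypothetical stand-in for the segments column of mydf_reshaped
segments <- c("..1", "..2", "..1", "..1", "..1")

# Remove every character that is not a digit, then convert to integer
segments_int <- as.integer(gsub("[^0-9]", "", segments))
segments_int
```

Inside the pipeline the same expression could be used in a mutate() step, e.g. mutate(segments = as.integer(gsub("[^0-9]", "", segments))). It should also leave labels that are already plain digits unchanged.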
Data
myList <- list(abc = list(structure(list(variant = "enabled", allocation = 100L), class = "data.frame", row.names = 1L),
structure(list(variant = "control", allocation = 100L), class = "data.frame", row.names = 1L)),
def = list(structure(list(variant = "enable", allocation = 100L), class = "data.frame", row.names = 1L)),
hahaha = list(structure(list(variant = c("superman", "superhero"
), allocation = c(5L, 95L)), class = "data.frame", row.names = 1:2)))
Upvotes: 1
Reputation: 2677
The easiest way is to use dplyr's bind_rows().
library(dplyr)
df_list <- list(iris[1:5,], iris[6:10,])
bind_rows(df_list)
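The iris example only binds a flat list. For the two-level list in the question, a possible sketch is to call bind_rows() twice and let its .id argument record where each row came from. This assumes the structure shown by str(myList) in the question; the small myList below is rebuilt by hand from the three sample records, and the column names test_name and segments are taken from the desired output.

```r
library(dplyr)

# Sample data rebuilt by hand from the three records in the question
myList <- list(
  abc = list(
    data.frame(variant = "enabled", allocation = 100L, stringsAsFactors = FALSE),
    data.frame(variant = "control", allocation = 100L, stringsAsFactors = FALSE)
  ),
  def = list(
    data.frame(variant = "enable", allocation = 100L, stringsAsFactors = FALSE)
  ),
  hahaha = list(
    data.frame(variant = c("superman", "superhero"), allocation = c(5L, 95L),
               stringsAsFactors = FALSE)
  )
)

# Inner step: bind the data frames of each test; the inner lists are unnamed,
# so .id labels each data frame by its position within the test
per_test <- lapply(myList, bind_rows, .id = "segments")

# Outer step: bind across tests; .id takes the list names (abc, def, hahaha)
result <- bind_rows(per_test, .id = "test_name")

# Store the position label as an integer
result$segments <- as.integer(result$segments)
result
```

Since this is one lapply() over the top-level list, it should carry over to all 8988 entries as long as every inner element is a data frame with compatible column types.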
Upvotes: 0