MayaGans
MayaGans

Reputation: 1845

flatten list column within dataframe in R

I have a data frame with two columns containing lists.

test <- data.frame(var_1 = c("ONE", "TWO"))

test$var_2 <- c(list(data.frame(Date = c("Date 1", "Date 2")), list(data.frame(Date = c("Date 3", "Date 4", "Date 5")))))                   
test$var_3 <- c(list(data.frame(Name = c("Name 1", "Name 2")), list(data.frame(Date = c("Name 3", "Name 4", "Name 5")))))                   

I'd like to loop over the lists to create a flattened data frame, repeating var_1 for the length of the list of each row

Desired Output

data.frame(var_1 = c("ONE", "ONE", "TWO", "TWO", "TWO"),
           Date = c("Date 1", "Date 2", "Date 3", "Date 4", "Date 5"),
           Name = c("Name 1", "Name 2", "Name 3", "Name 4", "Name 5"))


  var_1   Date   Name
1   ONE Date 1 Name 1
2   ONE Date 2 Name 2
3   TWO Date 3 Name 3
4   TWO Date 4 Name 4
5   TWO Date 5 Name 5

I tried applying some of the proposed solutions like using unnest from this post Flatten list column in data frame with ID column but this didn't quite give me my desired result.

Upvotes: 2

Views: 1567

Answers (2)

chinsoon12
chinsoon12

Reputation: 25225

An option using data.table:

setDT(test)[, c(.(var_1=var_1), do.call(cbind.data.frame, list(var_2, var_3))), var_1][, 
    (1L) := NULL][]

output:

   var_1   Date   Name
1:   ONE Date 1 Name 1
2:   ONE Date 2 Name 2
3:   TWO Date 3 Name 3
4:   TWO Date 4 Name 4
5:   TWO Date 5 Name 5

data:

test <- data.frame(var_1 = c("ONE", "TWO"))

test$var_2 <- c(list(data.frame(Date = c("Date 1", "Date 2")), list(data.frame(Date = c("Date 3", "Date 4", "Date 5")))))                   
test$var_3 <- c(list(data.frame(Name = c("Name 1", "Name 2")), list(data.frame(Name = c("Name 3", "Name 4", "Name 5")))))                   

Upvotes: 1

akrun
akrun

Reputation: 887088

Here is one option where we unlist the 'var_2', 'var_3', and unnest

library(dplyr)
library(purrr)
library(tidyr)
test %>% 
    group_split(var_1) %>%
    map_dfr(~ .x %>% 
                mutate_at(-1, ~ list(unlist(.))) %>% 
                unnest(c(var_2, var_3)))
# A tibble: 5 x 3
#  var_1 var_2  var_3 
#  <fct> <fct>  <fct> 
#1 ONE   Date 1 Name 1
#2 ONE   Date 2 Name 2
#3 TWO   Date 3 Name 3
#4 TWO   Date 4 Name 4
#5 TWO   Date 5 Name 5

Or we can do

test %>%
     rowwise %>%
     summarise_all(~ list(unlist(.))) %>%
     unnest(cols = everything())
# A tibble: 5 x 3
#  var_1 var_2  var_3 
#  <fct> <fct>  <fct> 
#1 ONE   Date 1 Name 1
#2 ONE   Date 2 Name 2
#3 TWO   Date 3 Name 3
#4 TWO   Date 4 Name 4
#5 TWO   Date 5 Name 5

Or with

test  %>% 
    group_by(var_1) %>%
    nest %>% 
    mutate(data = map(data, ~ summarise_all(.x, ~ list(unlist(.))) %>% 
    unnest(everything())))   %>% 
    unnest(data)

Upvotes: 3

Related Questions