Cleland

Reputation: 359

Transforming a list of lists into dataframe

I have a list containing a number of other lists, each of which contains a varying number of character vectors, with varying numbers of elements. I want to create a data frame in which each inner list is a row and each character vector within that list is a column. Where a character vector has more than one element, the elements should be concatenated, separated by a "+" sign, so that they can be stored as one string. The data looks like this:

fruits <- list(
  list(c("orange"), c("pear")),
  list(c("pear", "orange")),
  list(c("lemon", "apple"),
       c("pear"),
       c("grape"),
       c("apple"))
)

The expected output is like this:

fruits_df <- data.frame(col1 = c("orange", "pear + orange", "lemon + apple"),
                        col2 = c("pear", NA, "pear"),
                        col3 = c(NA, NA, "grape"),
                        col4 = c(NA, NA, "apple"))

There is no limit on the number of character vectors that a list can contain, so the solution needs to create columns dynamically. The resulting data frame should have as many columns as the longest list has character vectors.

Upvotes: 1

Views: 126

Answers (3)

Joris C.

Reputation: 6234

Another approach is to melt the list into a long data.frame using rrapply::rrapply and then cast it to the required wide format using data.table::dcast:

library(rrapply)
library(data.table)

## melt to long data.frame
long <- rrapply(fruits, f = paste, how = "melt", collapse = " + ")

## cast to wide data.table
setDT(long)
dcast(long[, .(L1, L2, value = unlist(value))], L1 ~ L2)[, !"L1"]
#>              ..1  ..2   ..3   ..4
#> 1:        orange pear  <NA>  <NA>
#> 2: pear + orange <NA>  <NA>  <NA>
#> 3: lemon + apple pear grape apple

Upvotes: 1

Ronak Shah

Reputation: 388907

For every list in fruits you can create a one-row data frame and then bind the rows together; dplyr::bind_rows fills the columns missing from shorter rows with NA.

dplyr::bind_rows(lapply(fruits, function(x) {
  # collapse each character vector to one string, then make a one-row data frame
  as.data.frame(t(sapply(x, paste0, collapse = "+")))
}))

#           V1   V2    V3    V4
#1      orange pear  <NA>  <NA>
#2 pear+orange <NA>  <NA>  <NA>
#3 lemon+apple pear grape apple

Upvotes: 2

MrFlick

Reputation: 206197

This is a bit messy, but here is one way:

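# collapse each character vector into a single string, one result vector per row
cols <- lapply(fruits, function(x) sapply(x, paste, collapse = " + "))
# the widest row determines the number of columns
ncols <- max(lengths(cols))
# pad every row with NA up to ncols, then bind the rows into a data frame
dd <- do.call("rbind.data.frame", lapply(cols, function(x) {length(x) <- ncols; x}))
names(dd) <- paste0("col", seq_len(ncol(dd)))
dd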

#            col1 col2  col3  col4
# 1        orange pear  <NA>  <NA>
# 2 pear + orange <NA>  <NA>  <NA>
# 3 lemon + apple pear grape apple

Alternatively, build the data frame column by column:

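ncols <- max(lengths(fruits))
# build column x from the x-th vector of every row; rows shorter than x get NA
# (pasting an out-of-range element would give "" rather than NA)
dd <- data.frame(lapply(seq_len(ncols), function(x) sapply(fruits, function(y)
  if (x <= length(y)) paste(y[[x]], collapse = " + ") else NA_character_)))
names(dd) <- paste0("col", seq_len(ncols))
dd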

Either way, you need to build each row or each column from your list and then combine them.
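
As a rough illustration of the row-wise version of that idea, here is a minimal base R sketch. It is not taken from any of the answers as written; the intermediate names rows, padded and m are made up for the example, and it assumes every inner element is a character vector as in the question's data.

```r
fruits <- list(
  list(c("orange"), c("pear")),
  list(c("pear", "orange")),
  list(c("lemon", "apple"), c("pear"), c("grape"), c("apple"))
)

# Collapse every character vector to one string; vapply enforces
# that each result is a single character value
rows <- lapply(fruits, function(x) vapply(x, paste, character(1), collapse = " + "))

# Pad each row with NA up to the length of the widest row
n <- max(lengths(rows))
padded <- lapply(rows, `length<-`, n)

# Stack the padded rows into a character matrix, then convert to a data frame
m <- do.call(rbind, padded)
colnames(m) <- paste0("col", seq_len(n))
fruits_df <- as.data.frame(m, stringsAsFactors = FALSE)
fruits_df
```

Going through a matrix keeps every column as plain character, and the number of columns follows from the longest inner list, so nothing has to be hard-coded.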

Upvotes: 2
