Reputation: 9793
mylist <- list(NULL, structure(list(Gender = structure(1L, .Label = "Female", class = "factor"),
ID = structure(1L, .Label = "1", class = "factor"), Class = structure(1L, .Label = "A", class = "factor"),
Score1 = 21.6, Score2 = 39.61, Score3 = 8.85,
Score4 = 13.66, Score5 = 2.64999999999999, Score6 = 6.94736842105265), .Names = c("Gender",
"ID", "Class", "Score1", "Score2", "Score3",
"Score4", "Score5", "Score6"), row.names = c(NA, -1L
), class = "data.frame"), list(structure(list(Gender = structure(1:2, .Label = c("Female",
"Male"), class = "factor"), ID = structure(c(1L, 1L), .Label = "2", class = "factor"),
Class = structure(c(1L, 1L), .Label = "A", class = "factor"),
Score1 = c(25.58, 18.31), Score2 = c(55.01,
36.28), Score3 = c(1.66, 2.13), Score4 = c(3.6,
4.24), Score5 = c(15.2727272727273, 8.57142857142858), Score6 = c(15.5833333333334,
8.4545454545455)), .Names = c("Gender", "ID", "Class",
"Score1", "Score2", "Score3", "Score4",
"Score5", "Score6"), row.names = c(NA, -2L), class = "data.frame"),
structure(list(Gender = structure(1:2, .Label = c("Female",
"Male"), class = "factor"), ID = structure(c(1L, 1L), .Label = "3", class = "factor"),
Class = structure(c(1L, 1L), .Label = "A", class = "factor"),
Score1 = c(27.16, 14.67), Score2 = c(58.39,
29.07), Score3 = c(1.66, 2.13), Score4 = c(3.6,
4.24), Score5 = c(16.2727272727273, 6.85714285714286),
Score6 = c(16.5833333333333, 6.81818181818185)), .Names = c("Gender",
"ID", "Class", "Score1", "Score2",
"Score3", "Score4", "Score5", "Score6"), row.names = c(NA,-2L), class = "data.frame")))
I have a list that looks like:
> mylist
[[1]]
NULL
[[2]]
Gender ID Class Score1 Score2 Score3 Score4 Score5 Score6
1 Female 1 A 21.6 39.61 8.85 13.66 2.65 6.947368
[[3]]
[[3]][[1]]
Gender ID Class Score1 Score2 Score3 Score4 Score5 Score6
1 Female 2 A 25.58 55.01 1.66 3.60 15.272727 15.583333
2 Male 2 A 18.31 36.28 2.13 4.24 8.571429 8.454545
[[3]][[2]]
Gender ID Class Score1 Score2 Score3 Score4 Score5 Score6
1 Female 3 A 27.16 58.39 1.66 3.60 16.272727 16.583333
2 Male 3 A 14.67 29.07 2.13 4.24 6.857143 6.818182
Some elements are NULL, and other elements can have multiple sub-elements, e.g. the third element has sub-elements [[1]] and [[2]].
I would like to combine these list elements into a single data.frame that looks something like this (I'm omitting the contents of columns Score2 through Score6 for convenience):
Gender ID Class Score1 Score2 ... Score6
1 Female 1 A 21.60
2 Female 2 A 25.58
3 Male 2 A 18.31
4 Female 3 A 27.16
5 Male 3 A 14.67
I've tried the following, but it produces warnings and NA values:
> tab <- unlist(mylist, recursive = FALSE)
> df <- do.call("rbind", tab)
Warning in `[<-.factor`(`*tmp*`, ri, value = 1L) :
invalid factor level, NA generated
Warning in `[<-.factor`(`*tmp*`, ri, value = 1L) :
invalid factor level, NA generated
...
Using ldply does not handle the last element correctly:
> ldply(mylist, data.frame)
Gender ID Class Score1 Score2 Score3 Score4 Score5 Score6 Gender.1 ID.1 Class.1 Score1.1 Score2.1 Score3.1 Score4.1 Score5.1 Score6.1
1 Female 1 A 21.60 39.61 8.85 13.66 2.650000 6.947368 <NA> <NA> <NA> NA NA NA NA NA NA
2 Female 2 A 25.58 55.01 1.66 3.60 15.272727 15.583333 Female 3 A 27.16 58.39 1.66 3.60 16.272727 16.583333
3 Male 2 A 18.31 36.28 2.13 4.24 8.571429 8.454545 Male 3 A 14.67 29.07 2.13 4.24 6.857143 6.818182
Upvotes: 2
Reputation: 47300
I propose a generalized unlist function called unlist_unless, which works like unlist with the following differences:
- The predicate argument is used to leave some sub-elements untouched (formula notation is supported if purrr is installed).
- ... passes additional arguments to predicate.
- keep_null is used to keep (the default) or drop NULL elements.
Like unlist, it features the parameters recursive and use.names, with the same defaults.
unlist_unless <- function(x, predicate = function(x) FALSE, ..., recursive = TRUE, use.names = TRUE, keep_null = TRUE) {
  # Allow purrr-style formula predicates, e.g. ~ is.data.frame(.x)
  if (inherits(predicate, "formula")) {
    if (requireNamespace("purrr", quietly = TRUE)) {
      predicate <- purrr::as_mapper(predicate)
    } else {
      stop("Package `purrr` needs to be installed to use formula notation")
    }
  }
  unlist(lapply(x, function(y) {
    if (predicate(y, ...) || (keep_null && is.null(y))) {
      # Wrap in a list so the outer unlist() leaves this element intact
      list(y)
    } else if (is.list(y) && recursive) {
      # Flatten nested lists with the same settings
      unlist_unless(y, predicate = predicate, ..., keep_null = keep_null, use.names = use.names)
    } else {
      y
    }
  }),
  recursive = FALSE,
  use.names = use.names)
}
Example 1: simplest case
df <- head(iris)[1:3]
dfs<- list(df[1,],
NULL,
list(df[2,],
df[3,],
list(df[4,]),
NULL))
unlist_unless(dfs, is.data.frame)
# [[1]]
# Sepal.Length Sepal.Width Petal.Length
# 1 5.1 3.5 1.4
#
# [[2]]
# NULL
#
# [[3]]
# Sepal.Length Sepal.Width Petal.Length
# 2 4.9 3 1.4
#
# [[4]]
# Sepal.Length Sepal.Width Petal.Length
# 3 4.7 3.2 1.3
#
# [[5]]
# Sepal.Length Sepal.Width Petal.Length
# 4 4.6 3.1 1.5
#
# [[6]]
# NULL
Example 2: keep_null = FALSE
unlist_unless(dfs, is.data.frame, keep_null = FALSE)
# [[1]]
# Sepal.Length Sepal.Width Petal.Length
# 1 5.1 3.5 1.4
#
# [[2]]
# Sepal.Length Sepal.Width Petal.Length
# 2 4.9 3 1.4
#
# [[3]]
# Sepal.Length Sepal.Width Petal.Length
# 3 4.7 3.2 1.3
#
# [[4]]
# Sepal.Length Sepal.Width Petal.Length
# 4 4.6 3.1 1.5
Example 3: recursive = FALSE
unlist_unless(dfs, is.data.frame, recursive = FALSE)
# [[1]]
# Sepal.Length Sepal.Width Petal.Length
# 1 5.1 3.5 1.4
#
# [[2]]
# NULL
#
# [[3]]
# Sepal.Length Sepal.Width Petal.Length
# 2 4.9 3 1.4
#
# [[4]]
# Sepal.Length Sepal.Width Petal.Length
# 3 4.7 3.2 1.3
#
# [[5]]
# [[5]][[1]]
# Sepal.Length Sepal.Width Petal.Length
# 4 4.6 3.1 1.5
#
#
# [[6]]
# NULL
Then it's straightforward to call bind_rows(dfs_new) or do.call(rbind, dfs_new) on the result (called dfs_new here):
do.call(rbind,unlist_unless(dfs, is.data.frame))
# Sepal.Length Sepal.Width Petal.Length
# 1 5.1 3.5 1.4
# 2 4.9 3.0 1.4
# 3 4.7 3.2 1.3
# 4 4.6 3.1 1.5
# or
library(dplyr)
unlist_unless(dfs, is.data.frame) %>% bind_rows
# Sepal.Length Sepal.Width Petal.Length
# 1 5.1 3.5 1.4
# 2 4.9 3.0 1.4
# 3 4.7 3.2 1.3
# 4 4.6 3.1 1.5
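As the output above suggests, the NULL entries kept by default do not seem to need removing before binding. Here is a minimal base R sketch of that point; the names with_nulls, without_nulls and bound are just illustrative, and it relies on my understanding that rbind on data frames skips zero-length arguments.

```r
# Toy data in the same shape as the examples above.
df <- head(iris)[1:3]
with_nulls <- list(df[1, ], NULL, df[2, ], NULL)

# Bind directly, NULL entries included.
bound <- do.call(rbind, with_nulls)

# Drop the NULL entries explicitly first, then bind, for comparison.
without_nulls <- Filter(Negate(is.null), with_nulls)
bound_filtered <- do.call(rbind, without_nulls)

# Compare the two results.
same <- identical(bound, bound_filtered)
print(bound)
print(same)
```

So keep_null = FALSE matters mostly when you want to inspect or count the flattened list itself, rather than for the binding step.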
Upvotes: 1
Reputation: 16832
You can do this very simply with a couple of tidyverse functions. purrr::reduce repeatedly applies a two-argument function to combine the elements of a list or vector into a single value, and dplyr::bind_rows is like an expanded, smarter rbind.
Note that, as I said in my comment, you get warnings because you're binding character vectors with factor vectors, but these are only warnings, not errors.
purrr::reduce(mylist, dplyr::bind_rows)
#> Warning in bind_rows_(x, .id): binding character and factor vector,
#> coercing into character vector
...
#> Gender ID Class Score1 Score2 Score3 Score4 Score5 Score6
#> 1 Female 1 A 21.60 39.61 8.85 13.66 2.650000 6.947368
#> 2 Female 2 A 25.58 55.01 1.66 3.60 15.272727 15.583333
#> 3 Male 2 A 18.31 36.28 2.13 4.24 8.571429 8.454545
#> 4 Female 3 A 27.16 58.39 1.66 3.60 16.272727 16.583333
#> 5 Male 3 A 14.67 29.07 2.13 4.24 6.857143 6.818182
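If the warnings bother you, one option is to convert the factor columns to character before binding. The following is only a sketch in base R on made-up data shaped like the question's; factors_to_character is a hypothetical helper name, not something from purrr or dplyr.

```r
# Hypothetical helper: turn every factor column of a data frame into character.
factors_to_character <- function(df) {
  df[] <- lapply(df, function(col) if (is.factor(col)) as.character(col) else col)
  df
}

# Two small data frames whose factor columns have different levels.
a <- data.frame(Gender = factor("Female"), ID = factor("1"), Score1 = 21.6)
b <- data.frame(Gender = factor(c("Female", "Male")),
                ID = factor(c("2", "2")),
                Score1 = c(25.58, 18.31))

# Clean each data frame, then bind them.
cleaned <- lapply(list(a, b), factors_to_character)
combined <- do.call(rbind, cleaned)
str(combined)
```

The same cleaned list should also go through dplyr::bind_rows without the coercion warnings, since every text column is already character. For a nested list like mylist, you would need to flatten it first so that each element is a single data frame.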
Upvotes: 6
Reputation: 2047
Try this:
ll <- unlist(lapply(mylist, function(x) if(is.data.frame(x)) list(x) else x), recursive = FALSE)
do.call(rbind, ll)
Gender ID Class Score1 Score2 Score3 Score4 Score5 Score6
1 Female 1 A 21.60 39.61 8.85 13.66 2.650000 6.947368
2 Female 2 A 25.58 55.01 1.66 3.60 15.272727 15.583333
3 Male 2 A 18.31 36.28 2.13 4.24 8.571429 8.454545
4 Female 3 A 27.16 58.39 1.66 3.60 16.272727 16.583333
5 Male 3 A 14.67 29.07 2.13 4.24 6.857143 6.818182
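To see why the wrapping step matters, here is a small sketch on toy data (the names nested, naive, wrapped and flat are only for illustration). A data frame is itself a list, so a plain one-level unlist splits a top-level data frame into its columns. Wrapping it in list() first means unlist strips exactly one level from every element.

```r
# Toy data with the same shape as mylist: NULL, a data frame, a list of data frames.
df1 <- data.frame(id = 1L, score = 21.6)
df2 <- data.frame(id = c(2L, 2L), score = c(25.58, 18.31))
df3 <- data.frame(id = c(3L, 3L), score = c(27.16, 14.67))
nested <- list(NULL, df1, list(df2, df3))

# Plain one-level unlist: df1 is broken up into its columns.
naive <- unlist(nested, recursive = FALSE)
print(vapply(naive, is.data.frame, logical(1)))

# Wrap top-level data frames so that every element is a list of data frames.
wrapped <- lapply(nested, function(x) if (is.data.frame(x)) list(x) else x)
flat <- unlist(wrapped, recursive = FALSE)

# Every element of flat is now a whole data frame, so rbind works.
result <- do.call(rbind, flat)
print(result)
```

Note that this only handles one level of nesting, which is all that mylist has. Deeper nesting would need a recursive approach.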
Upvotes: 2