Reputation: 1535
Is there a more tidyverse
-idiomatic way to combine several columns into a list column than using mapply
?
For example given the following
tibble(.rows = 9) %>%
mutate(foo = runif(n()),
a_1 = runif(n()),
a_2 = runif(n()),
a_3 = runif(n())) ->
Z
(where Z
might contain other columns, and might also contain more than 3 a
s) one can do
Z %>% mutate(A = mapply(c, a_1, a_2, a_3, SIMPLIFY = FALSE))
which works fine, although it would be nice to be able to say starts_with('a_')
instead of a_1, a_2, a_3
.
Another possibility is
Z %>%
rowid_to_column() %>%
pivot_longer(cols = starts_with('a_')) %>%
group_by(rowid) %>%
summarise(foo = unique(foo),
A = list(value)) %>%
select(-rowid)
which technically works, but introduces other problems (e.g., it uses an ugly foo = unique(foo)
; furthermore if instead of just one foo
there were many foo
s it would become a bit more involved).
Upvotes: 4
Views: 2531
Reputation: 576
Based on a previous answer (now deleted) and the comments, I made a comparison of different solutions:
FUN_mapply <- function() { Z %>% mutate(A = mapply(c, a_1, a_2, a_3, SIMPLIFY = FALSE)) }
FUN_asplit <- function() { Z %>% mutate(A = asplit(.[,grepl("^a", colnames(.))], 1)) }
FUN_pmap <- function() { Z %>% mutate(A = pmap(.[,grepl("^a", colnames(.))], c)) }
FUN_transpose <- function() { Z %>% mutate(A = transpose(.[,grepl("^a", colnames(.))])) }
FUN_asplit_tidy <- function() { Z %>% mutate(A = asplit(select(., starts_with("a")), 1)) }
FUN_pmap_tidy <- function() { Z %>% mutate(A = pmap(select(., starts_with("a")), c)) }
FUN_transpose_tidy <- function() { Z %>% mutate(A = transpose(select(., starts_with("a")))) }
all(unlist(pmap(list(FUN_mapply()$A, FUN_asplit()$A, FUN_pmap()$A, FUN_transpose()$A), ~all(mapply(all.equal, .x, .y, MoreArgs = list(attributes = F)))))) # All A columns are equal?
mb <- microbenchmark::microbenchmark(
FUN_mapply(),
FUN_asplit(),
FUN_pmap(),
FUN_transpose(),
FUN_asplit_tidy(),
FUN_pmap_tidy(),
FUN_transpose_tidy(),
times = 1000L
)
ggplot2::autoplot(mb)
Edit: Replace select(., starts_with("a"))
with Z[,grepl("^a", colnames(Z))]
Upvotes: 7