Reputation: 85
I am trying to pass column names to the following function.
library(data.table)
library(rlang)

unnest_dt <- function(tbl, ...) {
  tbl <- as.data.table(tbl)
  # Capture the unquoted column names passed through `...` as symbols
  col <- ensyms(...)
  # Every remaining column becomes a grouping column
  clnms <- syms(setdiff(colnames(tbl), as.character(col)))
  tbl <- eval(
    expr(tbl[, lapply(.SD, unlist), by = list(!!!clnms), .SDcols = as.character(col)])
  )
  colnames(tbl) <- c(as.character(clnms), as.character(col))
  tbl
}
The function is built for unnesting a data frame with multiple list columns. Consider the following use of the function on dummy data.
library(tibble)
df <- tibble(
  a = LETTERS[1:5],
  b = LETTERS[6:10],
  list_column_1 = list(c(LETTERS[1:5]), "F", "G", "H", "I"),
  list_column_2 = list(c(LETTERS[1:5]), "F", "G", "H", "I")
)
df <- unnest_dt(df, list_column_1, list_column_2)
It serves the purpose. However, I am trying to loop over this function, and I need to pass column names to it. For example, I want to be able to do the following:
library(dplyr)
col <- colnames(df %>% select_if(is.list))
df <- unnest_dt(df, col)
As expected, this gives an error:
Error in `[.data.table`(tbl, , lapply(.SD, unlist), by = list(a, b, list_column_1, :
column or expression 3 of 'by' or 'keyby' is type list. Do not quote column names. Usage: DT[,sum(colC),by=list(colA,month(colB))]
Would anyone know how I can proceed with this? Any help would be greatly appreciated.
Upvotes: 0
Views: 443
Reputation: 389235
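You can change the function to work with a character vector. In your version, ensyms(...) captures the argument name col itself rather than the strings stored in it, so no list column is excluded from by. Using c(...) instead takes the values: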
unnest_dt <- function(tbl, ...) {
  tbl <- as.data.table(tbl)
  # Column names now arrive as strings, e.g. c("list_column_1", "list_column_2")
  col <- c(...)
  # Every remaining column becomes a grouping column
  clnms <- syms(setdiff(colnames(tbl), col))
  tbl <- eval(
    expr(tbl[, lapply(.SD, unlist), by = list(!!!clnms),
             .SDcols = as.character(col)])
  )
  colnames(tbl) <- c(as.character(clnms), as.character(col))
  tbl
}
and then use it like this:
unnest_dt(df,col)
# a b list_column_1 list_column_2
#1: A F A A
#2: A F B B
#3: A F C C
#4: A F D D
#5: A F E E
#6: B G F F
#7: C H G G
#8: D I H H
#9: E J I I
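To make the cause of the original error concrete, here is a minimal sketch. It first shows that ensyms() records the name of the argument rather than its contents, and then gives one possible variant that avoids quasiquotation by handing data.table a character vector for by. The names capture_names and unnest_dt_chr are hypothetical helpers made up for this illustration, and the sketch assumes the non-list columns identify each row, as in the example data.

```r
library(data.table)
library(rlang)

# Hypothetical helper: shows what ensyms() sees when given a variable.
capture_names <- function(...) as.character(ensyms(...))

col <- c("list_column_1", "list_column_2")
capture_names(col)
# "col"  (the variable's name, not the strings it holds)

# Hypothetical variant: take the list-column names as a character vector
# and group by the remaining columns, also given as a character vector.
unnest_dt_chr <- function(tbl, cols) {
  tbl <- as.data.table(tbl)
  by_cols <- setdiff(colnames(tbl), cols)
  tbl[, lapply(.SD, unlist), by = by_cols, .SDcols = cols]
}

# Same shape as the dummy data in the question, built as a data.table.
dt <- data.table(
  a = LETTERS[1:5],
  b = LETTERS[6:10],
  list_column_1 = list(LETTERS[1:5], "F", "G", "H", "I"),
  list_column_2 = list(LETTERS[1:5], "F", "G", "H", "I")
)

res <- unnest_dt_chr(dt, col)
```

Because by and .SDcols both accept character vectors, this variant needs neither expr() nor eval(), and the result keeps its column names without the manual colnames<- step.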
Upvotes: 1