Reputation: 2566
I am working with a very wide table that I need to pivot into a longer format. Usually for such task, I am using tidyverse
however the table is containing a large volume of records and I want to leverage data.table
for this task.
Below you can find a sample dataset with the tidyverse version of the code and the version using data.table
.
I am having issue converting the columns of interest into numeric.
What is the proper way to do the conversion from tidyverse to data.table for this code?
library(data.table)
library(tidyverse)
DT = tibble(
year_a = 1999:2020,
year_b = 1999:2020,
a = as.character(sample(0:1, 22, replace = TRUE)),
b = as.character(sample(0:1, 22, replace = TRUE)),
c = as.character(sample(0:1, 22, replace = TRUE)),
d = as.character(sample(0:1, 22, replace = TRUE))
)
# tidyverse version
long_DT <- DT %>%
filter(year_a >= 2010 & year_b >= 2010) %>%
mutate(across(a:d, .fns = as.double)) %>%
pivot_longer(cols = a:d,
names_to = "letter",
values_to = "value") %>%
clean_names()
dim(long_DT)
long_DT %>% glimpse()
# data.table
setDT(DT)
# the line after is causing problems. How to integrate it into the melt function directly?
DT[, select(.SD, a:d)] <- apply(DT[,select(.SD, a:d)], 2, function(x) as.numeric(x))
DT_long <- melt(data = DT[
year_a >= 2010 & year_b >= 2010],
id.vars = c("year_a", "year_b"),
variable.name = "letter",
value.name = "value"
)
dim(DT_long)
DT_long %>% glimpse()
Upvotes: 2
Views: 113
Reputation: 56179
Maybe convert to numeric after melting:
res <- melt(DT[year_a >= 2010 & year_b >= 2010, ],
id.vars = c("year_a", "year_b"),
variable.name = "letter",
value.name = "value")[, value := as.numeric(value)]
Or if we have to convert to numeric before melting:
cols <- colnames(DT)[3:6]
res <- melt(DT[year_a >= 2010 & year_b >= 2010,
][, (cols) := lapply(.SD, as.numeric), .SDcols = cols],
id.vars = c("year_a", "year_b"),
variable.name = "letter",
value.name = "value")
Upvotes: 3