Michael
Michael

Reputation: 2566

Convert melt operation from tidyverse to data.table

I am working with a very wide table that I need to pivot into a longer format. Usually for such task, I am using tidyverse however the table is containing a large volume of records and I want to leverage data.table for this task.

Below you can find a sample dataset with the tidyverse version of the code and the version using data.table. I am having issue converting the columns of interest into numeric.

What is the proper way to do the conversion from tidyverse to data.table for this code?

library(data.table)
library(tidyverse)

DT = tibble(
    year_a = 1999:2020,
    year_b = 1999:2020,
    a = as.character(sample(0:1, 22, replace = TRUE)),
    b = as.character(sample(0:1, 22, replace = TRUE)), 
    c = as.character(sample(0:1, 22, replace = TRUE)),
    d = as.character(sample(0:1, 22, replace = TRUE))
)



# tidyverse version
long_DT <- DT %>%
    filter(year_a >= 2010 & year_b >= 2010) %>%
    mutate(across(a:d, .fns = as.double)) %>%
    pivot_longer(cols      = a:d,
                 names_to  = "letter",
                 values_to = "value") %>%
    clean_names()

dim(long_DT)
long_DT %>% glimpse()

# data.table
setDT(DT)
# the line after is causing problems. How to integrate it into the melt function directly?
DT[, select(.SD, a:d)] <- apply(DT[,select(.SD, a:d)], 2, function(x) as.numeric(x))
DT_long <- melt(data = DT[
                    year_a >= 2010 & year_b >= 2010],
                id.vars = c("year_a", "year_b"),
                variable.name = "letter",
                value.name = "value"
            )
dim(DT_long)
DT_long %>% glimpse()

Upvotes: 2

Views: 113

Answers (1)

zx8754
zx8754

Reputation: 56179

Maybe convert to numeric after melting:

res <- melt(DT[year_a >= 2010 & year_b >= 2010, ],
            id.vars = c("year_a", "year_b"),
            variable.name = "letter",
            value.name = "value")[, value := as.numeric(value)]

Or if we have to convert to numeric before melting:

cols <- colnames(DT)[3:6]
res <- melt(DT[year_a >= 2010 & year_b >= 2010, 
               ][, (cols) := lapply(.SD, as.numeric), .SDcols = cols],
            id.vars = c("year_a", "year_b"),
            variable.name = "letter",
            value.name = "value")

Upvotes: 3

Related Questions