Tyler Rinker
Tyler Rinker

Reputation: 109874

data.table to long based on one column of vectors repeat another column

I have a data.table with 2 columns that are a list of vectors.

   x     y        z
1: 1 1,2,3  8, 9,10
2: 2   5,6        3
3: 3 18,19      1,2

I want to stretch and unlist by one list of vectors (z`) but retain and repeat accordingly the other column that's a list of vectors. I saw how to almost do this but get an error as seen below:

library(data.table)

dat <- data.frame(
    x = 1:3,
    stringsAsFactors = FALSE
)

dat[['y']] <- list(1:3, 5:6, 18:19)
dat[['z']] <- list(8:10, 3, 1:2)

setDT(dat)

# inefficient way to get what I want 
a <- unlist(dat[['z']]) 
dat <- dat[rep(1:nrow(dat), sapply(z, length)), ]
dat[['z']] <- a

dat

   x     y  z
1: 1 1,2,3  8
2: 1 1,2,3  9
3: 1 1,2,3 10
4: 2   5,6  3
5: 3 18,19  1
6: 3 18,19  2

# trying to do this the data.table way
# Works but dropped column
dat[, .(z = as.integer(unlist(z))), by = .(x)]

# does not work (gives error)
dat[, .(z = as.integer(unlist(z))), by = .(x, y)]

Error in `[.data.table`(dat, , .(z = as.integer(unlist(z))), by = .(x,  : 
  column or expression 2 of 'by' or 'keyby' is type list. Do not quote column names. Usage: DT[,sum(colC),by=list(colA,month(colB))]

Upvotes: 4

Views: 188

Answers (1)

eddi
eddi

Reputation: 49448

Simply add the y column to your j-expression:

dat[, .(y, z = as.integer(unlist(z))), by = x]
#   x     y  z
#1: 1 1,2,3  8
#2: 1 1,2,3  9
#3: 1 1,2,3 10
#4: 2   5,6  3
#5: 3 18,19  1
#6: 3 18,19  2

Upvotes: 6

Related Questions