Create nested data.tables by collapsing rows into new data.tables

Question

How can you create a data.table that contains nested data.tables?

Example

library(data.table)

set.seed(7908)
dt <- data.table(x=1:5)[,list(y=letters[1:x],z=sample(1:100,x)),by=x]

dt
##     x y  z
##  1: 1 a 13
##  2: 2 a 27
##  3: 2 b 87
##  4: 3 a 85
##  5: 3 b 98
##  6: 3 c  1
##  7: 4 a 53
##  8: 4 b 81
##  9: 4 c 64
## 10: 4 d 45
## 11: 5 a 28
## 12: 5 b 26
## 13: 5 c 52
## 14: 5 d 55
## 15: 5 e 12

Desired output

For each unique value of x in dt, collapse the rows and create a data.table with columns y and z that is contained in a single column of dt. The result should look like this:

##    x        dt.yz
## 1: 1 <data.table>
## 2: 2 <data.table>
## 3: 3 <data.table>
## 4: 4 <data.table>
## 5: 5 <data.table>

In my actual use case, I have several data.tables with differing columns that I want to store in a single meta data.table.

dnlbrky · Accepted Answer

Build a data.table from y and z, then wrap it in a list so it can be "stuffed" into a single cell. Wrap that in a second list, which is where you name the resulting column. Use by=x to do this for each unique value of x.

dt2 <- dt[, list(dt.yz=list(data.table(y, z))), by=x]
dt2
##    x        dt.yz
## 1: 1 <data.table>
## 2: 2 <data.table>
## 3: 3 <data.table>
## 4: 4 <data.table>
## 5: 5 <data.table>
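
To see what the two layers of list() produce, it can help to inspect the new column directly. This is a minimal sketch that rebuilds dt and dt2 from the question and only uses base R and data.table helpers (is.list, sapply, is.data.table, nrow):

```r
library(data.table)

set.seed(7908)
dt <- data.table(x = 1:5)[, list(y = letters[1:x], z = sample(1:100, x)), by = x]
dt2 <- dt[, list(dt.yz = list(data.table(y, z))), by = x]

# The nested column is an ordinary list with one element per value of x
is.list(dt2$dt.yz)
length(dt2$dt.yz)

# Each element is itself a data.table holding that group's rows
sapply(dt2$dt.yz, is.data.table)
sapply(dt2$dt.yz, nrow)
```

The inner list() is what keeps each group's data.table in one cell; the outer list() just names the column.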

As Arun points out, using .SD is shorter and faster, and may be more convenient:

dt2 <- dt[, list(dt.yz=list(.SD)), by=x]
## dt.yz will include all columns not in the `by=`;
## Use `.SDcols=` to select specific columns
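
As a rough illustration of the `.SDcols=` comment, the sketch below nests only column z. The names dt3 and dt.z are made up for this example and are not part of the original answer:

```r
library(data.table)

set.seed(7908)
dt <- data.table(x = 1:5)[, list(y = letters[1:x], z = sample(1:100, x)), by = x]

# Restrict .SD to column z before nesting, so y is left out
dt3 <- dt[, list(dt.z = list(.SD)), by = x, .SDcols = "z"]

# Column names of the nested data.table for x == 5
names(dt3[x == 5, dt.z[[1]]])
```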

To retrieve a nested data.table later, subset the meta data.table (dt2) on the desired value of x, then take the first element of the dt.yz list column, which is the nested data.table.

dt2[x==5,dt.yz[[1]]]
##    y  z
## 1: a 28
## 2: b 26
## 3: c 52
## 4: d 55
## 5: e 12
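
The same `[[1]]` extraction can be run per group to go back to the flat layout. This is a sketch of one possible way to un-nest, assuming dt and dt2 are built as above; the name flat is hypothetical:

```r
library(data.table)

set.seed(7908)
dt <- data.table(x = 1:5)[, list(y = letters[1:x], z = sample(1:100, x)), by = x]
dt2 <- dt[, list(dt.yz = list(.SD)), by = x]

# For each x, pull out the nested data.table; by= stacks the results
flat <- dt2[, dt.yz[[1]], by = x]

# flat should have the same columns and rows as the original dt
names(flat)
nrow(flat)
```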
