Reputation: 9825
How can you create a data.table that contains nested data.tables?
library(data.table)
set.seed(7908)
dt <- data.table(x=1:5)[,list(y=letters[1:x],z=sample(1:100,x)),by=x]
dt
## x y z
## 1: 1 a 13
## 2: 2 a 27
## 3: 2 b 87
## 4: 3 a 85
## 5: 3 b 98
## 6: 3 c 1
## 7: 4 a 53
## 8: 4 b 81
## 9: 4 c 64
## 10: 4 d 45
## 11: 5 a 28
## 12: 5 b 26
## 13: 5 c 52
## 14: 5 d 55
## 15: 5 e 12
For each unique value of x in dt, I want to collapse the rows into a single data.table with columns y and z, stored in one cell of a single column of dt. The result should look like this:
## x dt.yz
## 1: 1 <data.table>
## 2: 2 <data.table>
## 3: 3 <data.table>
## 4: 4 <data.table>
## 5: 5 <data.table>
In my actual example I've got several data tables with differing columns that I want to store in a single meta data table.
Upvotes: 17
Views: 6355
Reputation: 9825
Create a data.table from y and z, then wrap it in a list so that the whole table can be "stuffed" into a single cell of a list column. Wrap that in yet another list, which is the usual list of output columns and is where you assign the resulting column name (dt.yz). Use by=x to do this for each unique value of x.
dt2 <- dt[, list(dt.yz=list(data.table(y, z))), by=x]
dt2
## x dt.yz
## 1: 1 <data.table>
## 2: 2 <data.table>
## 3: 3 <data.table>
## 4: 4 <data.table>
## 5: 5 <data.table>
As Arun points out, using .SD is shorter and faster, and may be more convenient:
dt2 <- dt[, list(dt.yz=list(.SD)), by=x]
## dt.yz will include all columns not in the `by=`;
## Use `.SDcols=` to select specific columns
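To illustrate the `.SDcols=` remark, here is a minimal sketch on a small made-up table. The extra column w is hypothetical and exists only to show that it is left out of the nested tables. It assumes a reasonably recent data.table release.

```r
library(data.table)

# Toy table; w is a hypothetical extra column that should not be nested.
dt <- data.table(x = c(1L, 1L, 2L),
                 y = c("a", "b", "a"),
                 z = c(10, 20, 30),
                 w = c(TRUE, FALSE, TRUE))

# .SDcols restricts .SD to y and z, so each nested table has only those columns.
dt2 <- dt[, list(dt.yz = list(.SD)), by = x, .SDcols = c("y", "z")]

# Column names of the nested table for the first group.
names(dt2$dt.yz[[1]])
```

Without `.SDcols=`, the nested tables would also carry w, since .SD holds every column not named in `by=`.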
To retrieve a nested data.table later, subset the meta data.table (dt2) on the desired value of x, then take the first element of the dt.yz list column for that row, which is the nested data.table.
dt2[x==5,dt.yz[[1]]]
## y z
## 1: a 28
## 2: b 26
## 3: c 52
## 4: d 55
## 5: e 12
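The same pattern should carry over to the case mentioned in the question, where the stored tables have differing columns. The following is only a sketch under that assumption: the tables sales and visits and the columns name, tbl and n_rows are invented for illustration.

```r
library(data.table)

# Two hypothetical tables with different columns.
sales  <- data.table(region = c("north", "south"), total = c(120, 95))
visits <- data.table(day = 1:3, count = c(10L, 14L, 9L),
                     source = c("web", "app", "web"))

# One row per stored table; the list column tbl holds the tables themselves.
meta <- data.table(name = c("sales", "visits"))
meta[, tbl := list(list(sales, visits))]

# Retrieval works as above: filter the row, then take the first list element.
meta[name == "visits", tbl[[1]]]

# A per-row summary computed from the list column.
meta[, n_rows := vapply(tbl, nrow, integer(1))]
```

Because each cell of a list column is an independent object, the nested tables do not need to share column names or types.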
Upvotes: 20