ira
ira

Reputation: 2644

How to unnest column of type list to wide format without using dcast

I have a data.table with column of type list. I would like to turn this list column into new columns in wide format. I can use unlist to first create a long format of the table and then use dcast to cast it to wide format (althoug i use the original list column in the process, because dcast says that Columns specified in formula can not be of type list

Example:

I have a data.table like so:

dt = data.table(
  list_col = list(c(404, 444), c(404, 444), c(444,484), c(444, 484), c(364, 404)),
  other_col = c('A', 'B', 'D', 'E', 'A'))

I can do:

# create a row id
dt[, row_id := .I]

# cast to wide
dcast(
  # replicate the data table, so that rows are matched properly when unlisting
  dt[
    rep(dt[,.I], lengths(list_col))
    ][
      ,
      # unlist the list column
      unlist_col := unlist(dt$list_col)
      ][
        ,
        # crate a row id per group so that we know how to cast to wide
        group_row_id := rowid(row_id)
        ][],
  # here i lose the original list_col, because i can't put the column of type list to the formula
  other_col + row_id ~ group_row_id, value.var = 'unlist_col')

This gives the desired result, and i can even fix the fact that i lost the list column by either joining on row_id or by sorting the resulting table by row_id and just adding the new columns to the original table, but it would be nice if there was more straightforward way for this operation.

Upvotes: 0

Views: 513

Answers (2)

Haci Duru
Haci Duru

Reputation: 456

If you have the same number of items in the list_col, you can try the following code:

> dt = cbind(dt, t(matrix(unlist(dt$list_col), ncol=nrow(dt))))
> dt
   list_col other_col  V1  V2
1:  404,444         A 404 444
2:  404,444         B 404 444
3:  444,484         D 444 484
4:  444,484         E 444 484
5:  364,404         A 364 404

Then you can change the names of the newly added columns.

Upvotes: 1

s_baldur
s_baldur

Reputation: 33508

Using data.table::transpose():

dt[, c("col1", "col2") := transpose(list_col)]

#    list_col other_col col1 col2
# 1:  404,444         A  404  444
# 2:  404,444         B  404  444
# 3:  444,484         D  444  484
# 4:  444,484         E  444  484
# 5:  364,404         A  364  404

Upvotes: 4

Related Questions