David
David

Reputation: 10192

Create column names based on "by" argument the data.table way

Say I have the following data.table

dt <- data.table(var = c("a", "b"), val = c(1, 2))

Now I want to add two new columns to dt, named a, and b with the respective values (1, 2). I can do this with a loop, but I want to do it the data.table way.

The result would be a data.table like this:

dt.res <- data.table(var = c("a", "b"), val = c(1, 2), #old vars 
                     a = c(1, NA), b = c(NA, 2)) # newly created vars

So far I came up with something like this

dt[, c(xx) := val, by = var]

where xx would be a data.table-command similar to .N which addresses the value of the by-group.

Thanks for the help!

Appendix: The for-loop way

The non-data.table-way with a for-loop instead of a by-argument would look something like this:

for (varname in dt$var){
     dt[var == varname, c(varname) := val]
}

Upvotes: 2

Views: 101

Answers (1)

akrun
akrun

Reputation: 887241

Based on the example showed, we can use dcast from the data.table to convert the long format to wide, and join with the original dataset on the 'val' column.

library(data.table)#v1.9.6+
dt[dcast(dt, val~var, value.var='val'), on='val']
#   var val  a  b
#1:   a   1  1 NA
#2:   b   2 NA  2

Or as @CathG mentioned in the comments, for previous versions either merge or set the key column and then join.

merge(dt, dcast.data.table(dt, val~var, value.var='val'))

Upvotes: 5

Related Questions