r data.table output in columns

Question

I am slowly learning data.table but I am struggling with getting the output format I need.

> z = data.table(X=c(1,1,1,2,2,2), Y=c("a","a","b","a","b","b"), Z=c(2,4,3,5,0,7))
> z
   X Y Z
1: 1 a 2
2: 1 a 4
3: 1 b 3
4: 2 a 5
5: 2 b 0
6: 2 b 7
> ansz <- z[, mean(Z), .(X, Y)]
> ansz
   X Y  V1
1: 1 a 3.0
2: 1 b 3.0
3: 2 a 5.0
4: 2 b 3.5

This is great but I would like to have the output split into columns by variable Y (see below). I am thinking lapply() but cannot figure it out.

   X Y=a Y=b
1: 1 3.0 3.0
2: 2 5.0 3.5

Frank Zhang · Accepted Answer

You almost get there.

library(data.table)
dt <- fread('X Y  V1
1 a 3.0
1 b 3.0
2 a 5.0
2 b 3.5')

dcast(dt,X~Y,value.var = "V1")
#>    X a   b
#> 1: 1 3 3.0
#> 2: 2 5 3.5

Then you can just rename columns

Or you can create the names first.

library(data.table)
dt <- fread('X Y  V1
1 a 3.0
1 b 3.0
2 a 5.0
2 b 3.5')


dt[,new_names:=ifelse(Y=="a","Y=a","Y=b")]

dcast(dt,X~new_names,value.var = "V1")
#>    X Y=a Y=b
#> 1: 1   3 3.0
#> 2: 2   5 3.5

r data.table output in columns

Answers (2)

Related Questions