Reputation: 390
I am slowly learning data.table but I am struggling with getting the output format I need.
> z = data.table(X=c(1,1,1,2,2,2), Y=c("a","a","b","a","b","b"), Z=c(2,4,3,5,0,7))
> z
X Y Z
1: 1 a 2
2: 1 a 4
3: 1 b 3
4: 2 a 5
5: 2 b 0
6: 2 b 7
> ansz <- z[, mean(Z), .(X, Y)]
> ansz
X Y V1
1: 1 a 3.0
2: 1 b 3.0
3: 2 a 5.0
4: 2 b 3.5
This is great but I would like to have the output split into columns by variable Y (see below). I am thinking lapply() but cannot figure it out.
X Y=a Y=b
1: 1 3.0 3.0
2: 2 5.0 3.5
Upvotes: 1
Views: 124
Reputation: 33498
Chain dcast()
to what you have already done:
z[, mean(Z), .(X, Y)
][, dcast(.SD, X ~ paste0("y=", Y))]
X y=a y=b
1: 1 3 3.0
2: 2 5 3.5
Or directly use dcast()
with fun = mean
:
dcast(z, X ~ paste0("y=", Y), fun=mean)
X y=a y=b
1: 1 3 3.0
2: 2 5 3.5
Upvotes: 0
Reputation: 1688
You almost get there.
library(data.table)
dt <- fread('X Y V1
1 a 3.0
1 b 3.0
2 a 5.0
2 b 3.5')
dcast(dt,X~Y,value.var = "V1")
#> X a b
#> 1: 1 3 3.0
#> 2: 2 5 3.5
Then you can just rename columns
Or you can create the names first.
library(data.table)
dt <- fread('X Y V1
1 a 3.0
1 b 3.0
2 a 5.0
2 b 3.5')
dt[,new_names:=ifelse(Y=="a","Y=a","Y=b")]
dcast(dt,X~new_names,value.var = "V1")
#> X Y=a Y=b
#> 1: 1 3 3.0
#> 2: 2 5 3.5
Upvotes: 1