Gecko
Gecko

Reputation: 390

r data.table output in columns

I am slowly learning data.table but I am struggling with getting the output format I need.

> z = data.table(X=c(1,1,1,2,2,2), Y=c("a","a","b","a","b","b"), Z=c(2,4,3,5,0,7))
> z
   X Y Z
1: 1 a 2
2: 1 a 4
3: 1 b 3
4: 2 a 5
5: 2 b 0
6: 2 b 7
> ansz <- z[, mean(Z), .(X, Y)]
> ansz
   X Y  V1
1: 1 a 3.0
2: 1 b 3.0
3: 2 a 5.0
4: 2 b 3.5

This is great but I would like to have the output split into columns by variable Y (see below). I am thinking lapply() but cannot figure it out.

   X Y=a Y=b
1: 1 3.0 3.0
2: 2 5.0 3.5

Upvotes: 1

Views: 124

Answers (2)

s_baldur
s_baldur

Reputation: 33498

Chain dcast() to what you have already done:

z[, mean(Z), .(X, Y)
  ][, dcast(.SD, X ~ paste0("y=", Y))]

   X y=a y=b
1: 1   3 3.0
2: 2   5 3.5

Or directly use dcast() with fun = mean:

dcast(z, X ~ paste0("y=", Y), fun=mean)

   X y=a y=b
1: 1   3 3.0
2: 2   5 3.5

Upvotes: 0

Frank Zhang
Frank Zhang

Reputation: 1688

You almost get there.

library(data.table)
dt <- fread('X Y  V1
1 a 3.0
1 b 3.0
2 a 5.0
2 b 3.5')

dcast(dt,X~Y,value.var = "V1")
#>    X a   b
#> 1: 1 3 3.0
#> 2: 2 5 3.5

Then you can just rename columns

Or you can create the names first.

library(data.table)
dt <- fread('X Y  V1
1 a 3.0
1 b 3.0
2 a 5.0
2 b 3.5')


dt[,new_names:=ifelse(Y=="a","Y=a","Y=b")]

dcast(dt,X~new_names,value.var = "V1")
#>    X Y=a Y=b
#> 1: 1   3 3.0
#> 2: 2   5 3.5

Upvotes: 1

Related Questions