Reputation: 14400
I have a data.table DT
set.seed(1)
DT <- data.table(x=rep(c(1,2,3),each=4), y=c("A","B"), v=sample(1:100,12))
DT
x y v
1: 1 A 29
2: 1 B 92
3: 1 A 100
4: 1 B 82
5: 2 A 28
6: 2 B 26
7: 2 A 18
8: 2 B 22
9: 3 A 30
10: 3 B 96
11: 3 A 15
12: 3 B 4
I would like to expand it like bellow, creating a new column for each value of x
and reporting the v
values, no structure in the data should be expected (not by blocks like bellow)
x y v.1 v.2 v.3
1: 1 A 29 NA NA
2: 1 B 92 NA NA
3: 1 A 100 NA NA
4: 1 B 82 NA NA
5: 2 A NA 28 NA
6: 2 B NA 26 NA
7: 2 A NA 18 NA
8: 2 B NA 22 NA
9: 3 A NA NA 30
10: 3 B NA NA 96
11: 3 A NA NA 15
12: 3 B NA NA 4
I asked a very similar question here but cannot adapt the answer G Grothendieck gave us at the time...
EDIT: As usual I just almost got it after I wrote the post... I just need to replace those 0 by NA (I might get 0 in v and I want to be able to dissociate v==0 from missing items)
DT2 <- DT[, {SUM.<-factor(x); data.table(model.matrix(~ SUM.:v + 0))}]
txtR) DT2
SUM.1:v SUM.2:v SUM.3:v
1: 29 0 0
2: 92 0 0
3: 100 0 0
4: 82 0 0
5: 0 28 0
6: 0 26 0
7: 0 18 0
8: 0 22 0
9: 0 0 30
10: 0 0 96
11: 0 0 15
12: 0 0 4
Upvotes: 3
Views: 160
Reputation: 49448
You could simply loop over the x
's and use data.table
assignment:
setkey(DT, x)
for (i in unique(DT$x)) {
DT[J(i), paste0("v.", i) := v]
}
P.S. I really wish the following worked, but .GRP
is not available there:
DT[, paste0("v.", .GRP) := v, by = x]
edit one more solution (trying to get the above .GRP
idea to work somehow), using rbind.fill
(I didn't run very careful benches, but this seemed to scale quite well)
library(plyr)
cbind(DT,
rbind.fill(DT[, list(list(setnames(data.table(v), paste0("v.", .GRP)))),
by = x]$V1))
Upvotes: 3
Reputation: 66819
set.seed(1)
DT <- data.table(x=rep(c(1,2,3),each=4), y=c("A","B"), v=sample(1:100,12))
This gives
x y v
1: 1 A 27
2: 1 B 37
3: 1 A 57
4: 1 B 89
5: 2 A 20
6: 2 B 86
7: 2 A 97
8: 2 B 62
9: 3 A 58
10: 3 B 6
11: 3 A 19
12: 3 B 16
Next, the answer:
ux <- unique(DT$x)
DT[,c(v.=lapply(ux,function(i)v[x==i])),by="x,y"]
which gives
x y v.1 v.2 v.3
1: 1 A 27 NA NA
2: 1 A 57 NA NA
3: 1 B 37 NA NA
4: 1 B 89 NA NA
5: 2 A NA 20 NA
6: 2 A NA 97 NA
7: 2 B NA 86 NA
8: 2 B NA 62 NA
9: 3 A NA NA 58
10: 3 A NA NA 19
11: 3 B NA NA 6
12: 3 B NA NA 16
That answer might break in later versions of R, but the OP pointed out that this works too and may be faster:
DT[,paste0("v.",ux):=lapply(ux,function(i)v[x==i]),by="x"]
Upvotes: 4
Reputation: 118889
Here's one way:
tt <- model.matrix(data=DT, ~ factor(x):rep(1, nrow(DT)) + 0)
tt[tt==0] <- NA
cbind(DT, DT$v * tt)
# x y v factor(x)1:v factor(x)2:v factor(x)3:v
# 1: 1 A 69 69 NA NA
# 2: 1 B 39 39 NA NA
# 3: 1 A 76 76 NA NA
# 4: 1 B 49 49 NA NA
# 5: 2 A 100 NA 100 NA
# 6: 2 B 95 NA 95 NA
# 7: 2 A 36 NA 36 NA
# 8: 2 B 73 NA 73 NA
# 9: 3 A 86 NA NA 86
# 10: 3 B 20 NA NA 20
# 11: 3 A 59 NA NA 59
# 12: 3 B 12 NA NA 12
Upvotes: 3