Reputation:
I have dataset that looks like below:
customerid product store Date Sales
1 A X1 1/2/2013 4
1 B x2 1/9/2013 4
1 A x2 1/9/2013 4
1 C x1 1/16/2013 2
1 B x1 1/16/2013 2
1 A x1 1/16/2013 4
2 A x1 1/23/2013 2
2 B x2 1/30/2013 2
2 C x1 2/6/2013 2
2 D x3 2/13/2013 4
I need to transpose by Product, such that all products appear as columns..like below..
customerid Date Store A B C D
1 1/2/2013 x1 4
1 1/9/2013 x2 4 4
1 1/16/2013 x1 4 2 2
2 1/23/2013 x1 2
2 1/30/2013 x2 2
2 2/6/2013 x1 2
2 2/13/2013 x3 4
Please help! I am trying to work with the transpose function, and i tried unsuccesfully to read through some threads here, but to no avail
Thanks!
Upvotes: 1
Views: 153
Reputation: 193517
You can use dcast
from "reshape2".
library(reshape2)
dcast(mydf, customerid + store + Date ~ product, value.var="Sales")
# customerid store Date A B C D
# 1 1 x1 1/16/2013 4 2 2 NA
# 2 1 X1 1/2/2013 4 NA NA NA
# 3 1 x2 1/9/2013 4 4 NA NA
# 4 2 x1 1/23/2013 2 NA NA NA
# 5 2 x1 2/6/2013 NA NA 2 NA
# 6 2 x2 1/30/2013 NA 2 NA NA
# 7 2 x3 2/13/2013 NA NA NA 4
If you wanted to have "" instead of NA
, you can do that too, but note that you would have coerced those columns to character
.
dcast(mydf, customerid + store + Date ~ product, value.var="Sales", fill="")
# customerid store Date A B C D
# 1 1 x1 1/16/2013 4 2 2
# 2 1 X1 1/2/2013 4
# 3 1 x2 1/9/2013 4 4
# 4 2 x1 1/23/2013 2
# 5 2 x1 2/6/2013 2
# 6 2 x2 1/30/2013 2
# 7 2 x3 2/13/2013 4
For a base R solution, you can use reshape()
:
reshape(mydf, direction = "wide",
idvar = c("customerid", "store", "Date"),
timevar = "product")
# customerid store Date Sales.A Sales.B Sales.C Sales.D
# 1 1 X1 1/2/2013 4 NA NA NA
# 2 1 x2 1/9/2013 4 4 NA NA
# 4 1 x1 1/16/2013 4 2 2 NA
# 7 2 x1 1/23/2013 2 NA NA NA
# 8 2 x2 1/30/2013 NA 2 NA NA
# 9 2 x1 2/6/2013 NA NA 2 NA
# 10 2 x3 2/13/2013 NA NA NA 4
Another possibility is to use model.matrix
(thanks @Thomas for explaining the model.matrix
approach in a recent Q & A):
# cbind(mydf, model.matrix(~ 0 + product, data = mydf) * mydf$Sales)
# customerid product store Date Sales productA productB productC productD
# 1 1 A X1 1/2/2013 4 4 0 0 0
# 2 1 B x2 1/9/2013 4 0 4 0 0
# 3 1 A x2 1/9/2013 4 4 0 0 0
# 4 1 C x1 1/16/2013 2 0 0 2 0
# 5 1 B x1 1/16/2013 2 0 2 0 0
# 6 1 A x1 1/16/2013 4 4 0 0 0
# 7 2 A x1 1/23/2013 2 2 0 0 0
# 8 2 B x2 1/30/2013 2 0 2 0 0
# 9 2 C x1 2/6/2013 2 0 0 2 0
# 10 2 D x3 2/13/2013 4 0 0 0 4
Upvotes: 1