Reputation: 323
I have a data.frame like this:
P Stat V Points
1 Goals 2 10
1 Assists 1 3
2 Goals 1 5
2 Assists 1 3
and I would like to convert it to something like this:
P Goals Assists Points
1 2 1 13
2 1 1 8
Currently I am using dcast, as follows:
dcast(stats, P ~ Stat, value.var = "V")
, which works just without the "Points". When I add Points in, it starts duplicating the new columns names with _1, _2 etc.
Any help appreciated. This is not a school project, I'm just a curious consultant trying to refresh my statistics skill-set on problem that interests me!
Upvotes: 1
Views: 53
Reputation: 42544
There are (at least) two possibilities to achieve the desired result.
The recent versions of data.table
allow to specify multiple value.var
as parameter to dcast()
:
library(data.table) # version 1.10.4 used
dcast(DT, P ~ Stat, value.var = list("V", "Points"))
# P V_Assists V_Goals Points_Assists Points_Goals
#1: 1 1 2 3 10
#2: 2 1 1 3 5
If only one Points
column is desired, the points need to be added and the unnecessary columns to be deleted. By chaining, this can be done in one statement but isn't very concise.
dcast(DT, P ~ Stat, value.var = list("V", "Points"))[
, Points := Points_Assists + Points_Goals][
, c("Points_Assists", "Points_Goals") := NULL][]
# P V_Assists V_Goals Points
#1: 1 1 2 13
#2: 2 1 1 8
Alternatively, the dcast of V
and the aggregation of points can be done in seperate steps and the results being joined afterwards:
# dcast
temp1 <- dcast(DT, P ~ Stat, value.var = "V")
temp1
# P Assists Goals
#1: 1 1 2
#2: 2 1 1
# sum points by P
temp2 <- DT[, .(Points = sum(Points)), by = P]
temp2
# P Points
#1: 1 13
#2: 2 8
Now the two results need to be joined:
temp1[temp2, on = "P"]
# P Assists Goals Points
#1: 1 1 2 13
#2: 2 1 1 8
or combined in one statement:
dcast(DT, P ~ Stat, value.var = "V")[DT[, .(Points = sum(Points)), by = P], on = "P"]
The code looks more straightforward and concise than the first variant.
library(data.table)
DT <- fread(
"P Stat V Points
1 Goals 2 10
1 Assists 1 3
2 Goals 1 5
2 Assists 1 3")
Note that fread()
returns a data.table object by default. In case DT
is still a data.frame it needs to be coerced to data.table by
setDT(DT)
Upvotes: 1
Reputation: 887048
We can do a dcast
and then add the 'Points' column
library(data.table)
dcast(setDT(d1), P~Stat, value.var = "V")[, Points := sum(d1$Points)][]
# P Assists Goals Points
#1: 1 1 2 13
Upvotes: 1