Reputation: 39
DT:
Hteam Ateam Season HT_Points AT_Points
Grodig Salzburg 2015/2016 23 29
Rapid Vienna Altach 2015/2016 38 15
Ried Austria Vienna 2015/2016 32 30
Sturm Graz Mattersburg 2015/2016 30 17
Admira Rapid Vienna 2015/2016 24 27
Altach Ried 2015/2016 25 10
Austria Vienna Sturm Graz 2015/2016 29 18
Mattersburg Grodig 2015/2016 22 12
Salzburg AC Wolfsberger 2015/2016 45 11
Rapid Vienna Ried 2016/2017 3 0
Altach AC Wolfsberger 2016/2017 3 0
Sturm Graz Salzburg 2016/2017 3 0
St. Polten Austria Vienna 2016/2017 0 3
Mattersburg Admira 2016/2017 0 3
Salzburg AC Wolfsberger 2016/2017 1 1
Ried Sturm Graz 2016/2017 3 0
Altach Rapid Vienna 2016/2017 6 0
Austria Vienna Mattersburg 2016/2017 3 0
Desired Output:
Hteam Ateam Season HT_Points AT_Points HT_PointsTOTAL AT_PointsTOTAL
Grodig Salzburg 2015/2016 23 29 23 + ? 29 + ?
Rapid Vienna Altach 2015/2016 38 15 38 + ? 15 + ?
Ried Austria Vienna 2015/2016 32 30 32 + ? 30 + ?
Sturm Graz Mattersburg 2015/2016 30 17 30 + ? 17 + ?
Admira Rapid Vienna 2015/2016 24 27 24 + ? 65
Altach Ried 2015/2016 25 10 40 42
Austria Vienna Sturm Graz 2015/2016 29 18 59 48
Mattersburg Grodig 2015/2016 22 12 39 35
Salzburg AC Wolfsberger 2015/2016 45 11 74 11 + ?
Rapid Vienna Ried 2016/2017 3 0 NA NA
Altach AC Wolfsberger 2016/2017 3 0 NA NA
Sturm Graz Salzburg 2016/2017 3 0 NA NA
St. Polten Austria Vienna 2016/2017 0 3 NA NA
Mattersburg Admira 2016/2017 0 3 NA NA
Salzburg AC Wolfsberger 2016/2017 1 1 1 NA
Ried Sturm Graz 2016/2017 3 0 3 3
Altach Rapid Vienna 2016/2017 6 0 NA 3
Austria Vienna Mattersburg 2016/2017 3 0 6 0
HT_PointsTOTAL = HT_Points + AT_Points(last game played as Ateam by Hteam)
AT_PointsTOTAL = AT_Points + HT_Points(last game played as Hteam by Ateam)
Note: ? --> It should be a number.
It has been put like this since the rows it refers to are not shown.
NA --> No previous game on that Season by Hteam as Ateam or by Ateam as Hteam.
I know what to look for a value from a previous row you can use shift. But in this case I do not know how to do it since the name of the team is the same but in different columns(Hteam & Ateam).
Maybe with shift you can not do what I want to do. The goal is to add the total points of a team. That is, when playing at home, you have to look for the points from the last time that the team played as a visitor and add them(and vice versa).
Maybe the only solution is to use a function to create the new columns. But I don't know how to do it.
It is necessary to use the Season column to group by.
If it is possible to use data.table package.
Upvotes: 2
Views: 102
Reputation: 1369
UPDATED
Very similar to @chinsoon12 with data.table
but avoiding using .SD
so IMO more concise:
library(data.table)
setDT(DT)
DT[, rn := .I]
# Join to get away points (i.AT_Points) for Hteam
DT[DT,
HT_PointsTOTAL := HT_Points + i.AT_Points,
on = .(Hteam=Ateam, Season=Season, rn>rn)] # note rn>rn (using greater than here)
# Join to get home points (i.HT_Points) for Ateam
DT[DT,
AT_PointsTOTAL := AT_Points + i.HT_Points,
on = .(Ateam=Hteam, Season=Season, rn<rn)] # note nr<rn (using less than here)
DT
Resulting in (for your updated sample data):
Hteam Ateam Season HT_Points AT_Points rn HT_PointsTOTAL AT_PointsTOTAL
1: Grodig Salzburg 2015/2016 23 29 1 NA NA
2: Rapid Vienna Altach 2015/2016 38 15 2 NA NA
3: Ried Austria Vienna 2015/2016 32 30 3 NA NA
4: Sturm Graz Mattersburg 2015/2016 30 17 4 NA NA
5: Admira Rapid Vienna 2015/2016 24 27 5 NA 65
6: Altach Ried 2015/2016 25 10 6 40 42
7: Austria Vienna Sturm Graz 2015/2016 29 18 7 59 48
8: Mattersburg Grodig 2015/2016 22 12 8 39 35
9: Salzburg AC Wolfsberger 2015/2016 45 11 9 74 NA
10: Rapid Vienna Ried 2016/2017 3 0 10 NA NA
11: Altach AC Wolfsberger 2016/2017 3 0 11 NA NA
12: Sturm Graz Salzburg 2016/2017 3 0 12 NA NA
13: St. Polten Austria Vienna 2016/2017 0 3 13 NA NA
14: Mattersburg Admira 2016/2017 0 3 14 NA NA
15: Salzburg AC Wolfsberger 2016/2017 1 1 15 1 NA
16: Ried Sturm Graz 2016/2017 3 0 16 3 3
17: Altach Rapid Vienna 2016/2017 6 0 17 NA 3
18: Austria Vienna Mattersburg 2016/2017 3 0 18 6 0
If you like you can remove the rn column when you're done:
DT$rn <- NULL
Upvotes: 0
Reputation: 25225
Here is an approach using data.table
non-equi join using row number to ensure that we only pick from previous rows:
library(data.table)
setDT(DT)
DT[, rn := .I]
#calculate home team points first
DT[, HT_PointsTotal :=
.SD[.SD, .(x.AT_Points + i.HT_Points), on=c("Season"="Season", "Ateam"="Hteam", "rn<rn")]]
#then calculate away team points
DT[, AT_PointsTotal :=
.SD[.SD, .(x.HT_Points + i.AT_Points), on=c("Season"="Season", "Hteam"="Ateam", "rn<rn")]]
Adding a roll
approach when dataset gets larger and there is Cartesian join error due to Hteam appearing multiple times in Ateam column.
dummy[, rn := .I]
dummy[, HT_PointsTotal :=
.SD[.SD, .(x.AT_Points + i.HT_Points), on=c("Season", "Ateam"="Hteam", "rn"), roll=Inf]
]
dummy[, AT_PointsTotal :=
.SD[.SD, .(x.HT_Points + i.AT_Points), on=c("Season", "Ateam"="Hteam", "rn"), roll=Inf]
]
dummy data (this was time consuming to create and does not reflect reality as well):
library(data.table)
DT <- fread("Hteam,Ateam,Season,HT_Points,AT_Points
Grodig,Salzburg,2015/2016,23,29
Rapid Vienna,Altach,2015/2016,38,15
Ried,Austria Vienna,2015/2016,32,30
Sturm Graz,Mattersburg,2015/2016,30,17
Admira,Rapid Vienna,2015/2016,24,27
Altach,Ried,2015/2016,25,10
Austria Vienna,Sturm Graz,2015/2016,29,18
Mattersburg,Grodig,2015/2016,22,12
Salzburg,AC Wolfsberger,2015/2016,45,11")
numTeams <- DT[,uniqueN(c(Hteam, Ateam))]
firstHalf <- lapply(seq_len(DT[,.N]),
function(n) data.table(
Matchday=n*2L-1L,
Hteam=DT[["Hteam"]],
Ateam=c(DT[["Ateam"]][-seq_len(n)], DT[["Ateam"]][seq_len(n)]),
Season=DT[["Season"]],
HT_Points=DT[["HT_Points"]],
AT_Points=DT[["AT_Points"]]
))
secondHalf <- lapply(seq_len(DT[,.N]),
function(n) data.table(
Matchday=n*2L,
Hteam=DT[["Ateam"]],
Ateam=c(DT[["Hteam"]][-seq_len(n)], DT[["Hteam"]][seq_len(n)]),
Season=DT[["Season"]],
HT_Points=DT[["HT_Points"]],
AT_Points=DT[["AT_Points"]]
))
dummy <- rbindlist(c(firstHalf, secondHalf))[
Hteam!=Ateam][,
.SD[1L], by=.(Hteam, Ateam)]
setorder(dummy, Matchday, Hteam)
Upvotes: 1