Reputation: 65
I hava a data.table DF as following. Third coulmn is the team which have the ball possesion at that time (described by minutes and secs).
minute second teamId isGoal
1: 10 22 Atletico MG 0
2: 10 26 Atletico MG 0
3: 10 30 Atletico MG 0
4: 10 33 Atletico MG 0
5: 10 35 Atletico MG 0
6: 10 37 Atletico MG 0
7: 10 38 Atletico MG 1
8: 10 40 Atletico GO 0
9: 10 42 Atletico GO 0
10: 10 48 Atletico GO 1
11: 10 51 Atletico MG 0
12: 10 54 Atletico MG 1
13: 10 60 Atletico MG 0
I would like to create twou new columns. Each one summing number of the goals of each team. For exemple, the output should be:
minute second teamId isGoal AtleticoMG AtleticoGO
1: 10 22 Atletico MG 0 0 0
2: 10 26 Atletico MG 0 0 0
3: 10 30 Atletico MG 0 0 0
4: 10 33 Atletico MG 0 0 0
5: 10 35 Atletico MG 0 0 0
6: 10 37 Atletico MG 0 0 0
7: 10 38 Atletico MG 1 1 0
8: 10 40 Atletico GO 0 1 0
9: 10 42 Atletico GO 0 1 0
10: 10 48 Atletico GO 1 1 1
11: 10 51 Atletico MG 0 1 1
11: 10 51 Atletico MG 0 1 1
12: 10 54 Atletico MG 1 2 1
13: 10 60 Atletico MG 0 2 1
A want to avoid for loops. I'm sure it must easy to do in data.table, but how?
Upvotes: 0
Views: 40
Reputation: 599
Here is a solution using dplyr
and a "toy" example that is similar to the data frame you give in the opening post.
Basically, basically you want a column for each team with indicators that are one if that team has scored a goal in a particular row of the dataframe. Then you use the cumsum()
function to cumulate within these newly-generated columns.
library(dplyr)
x <- data.frame( teamID=c('A', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A'),
isGoal=c(0,0,1,0,1,0,0,1, 0) ) %>%
mutate( AGoal = cumsum( isGoal*(teamID=='A') ),
BGoal = cumsum( isGoal*(teamID=='B') ) )
Output:
teamID isGoal AGoal BGoal
A 0 0 0
A 0 0 0
B 1 0 1
A 0 0 1
A 1 1 1
B 0 1 1
B 0 1 1
B 1 1 2
A 0 1 2
Upvotes: 1