Reputation: 23
I’m trying to figure out how to make a calculation across (or between?) rows. I’ve tried looking this up but clearly my Google-Fu is not strong today, because I’m just not finding the right search terms.
Here is a super simplified example of the type of data I’m trying to deal with:
mydf <- data.frame(pair = rep(1,2),
participant = c("PartX", "PartY"),
goalsAtt = c(6, 3),
goalsScr = c(2, 3))
We have data on how many "goals" a participant attempted and how many they actually scored, and lets say I want to know about their "defensive" ability. Now essentially what I want to do is mutate()
two new columns called… let’s say saved
and missed
, where saved
would be the goals attempted by the opposite participant minus the goals scored by them, and missed
would just be goals scored by the opposite participant. So obviously participant X would have saved 0 and missed 3, and participant Y will have saved 4 and missed 2.
Now obviously this is a stupid simple example, and I’ll have like 6 different “goal” types to deal with and 2.5k pairs to go through, but I’m just having a mental block about where to start with this.
Amateur coder here, and Tidyverse style solutions are appreciated.
Upvotes: 2
Views: 248
Reputation: 1702
OK, so let's assume that pair
can only be for 2 teams. Here's a tidyverse
solution where we first set an index number for position within a group and then subtract for goals saved. Something similar for goals missed.
library(tidyverse)
mydf %>%
group_by(pair) %>%
mutate(id = row_number()) %>%
mutate(goalsSaved = if_else(id == 1,
lead(goalsAtt) - lead(goalsScr),
lag(goalsAtt) - lag(goalsScr))) %>%
mutate(goalsMissed = if_else(id == 1,
lead(goalsScr),
lag(goalsScr)))
# A tibble: 2 x 7
# Groups: pair [1]
pair participant goalsAtt goalsScr id goalsSaved goalsMissed
<dbl> <fct> <dbl> <dbl> <int> <dbl> <dbl>
1 1 PartX 6 2 1 0 3
2 1 PartY 3 3 2 4 2
Upvotes: 2