DarthVinci52
DarthVinci52

Reputation: 33

Find next 2 rows with same value and paste rows into new columns behind original entry with dplyr

This is the first time I am posting a question here, so please be gentle :)
I have a data frame with goal and corner statistics from the English football/soccer league (Premier League) with one game per row. head(premierleague) will give you something like this (made up data):

| Home          | Home_goals    | Away          | Away_goals    | Home_Corners  | Away_Corners  |
|------------   |------------   |-----------    |------------   |-------------- |-------------- |
| Tottenham     | 1             | Arsenal       | 0             | 5             | 2             |
| Man United    | 2             | Watford       | 1             | 7             | 4             |
| Man City      | 3             | West Ham      | 0             | 10            | 2             |
| Chelsea       | 2             | Arsenal       | 1             | 7             | 6             |
| Tottenham     | 4             | Norwich       | 1             | 6             | 0             |
| Man United    | 2             | Liverpool     | 2             | 4             | 7             |
| Tottenham     | 0             | Man City      | 2             | 3             | 8             |

I would like to find for each entry in the column Home (first one is in this case Tottenham) the next 2 matching entries (rows 5 and 7) and paste them into new columns in row 1.
I want to do this for each row in my data frame and also keep all rows. I just want to add the statistics of the next two games as new columns:
Home_2
Home_goals_2
Away_2 and so on.

I honestly have no idea how to even google this and as far as my experience goes with stackoverflow, I am sure some of you will solve that within minutes :) Any help is highly appreciated.

Thanks so much in advance
Philipp

Edit:

Don't really know if I can attach something here, but the dataframe is this:

premierleague <- data.frame("Home" = c("Tottenham", "ManUnited", "ManCity", "Chelsea", "Tottenham", "ManUnited", "Tottenham"), 
                            "Home_goals" = c(1,2,3,2,4,2,0), 
                            "Away" = c ("Arsenal", "Watford", "Westham", "Arsenal", "Norwich", "Liverpool", "ManCity"), 
                            "Away_goals" = c(0,1,0,1,1,2,2), 
                            "Home_corners" = c(5,7,10,7,6,4,3), 
                            "Away_corners" = c(2,4,2,6,0,7,8))

### The desired result looks like this

premierleague_new <- data.frame(
  "Home" = c("Tottenham", "ManUnited", "ManCity", "Chelsea", "Tottenham", "ManUnited", "Tottenham"), 
  "Home_goals" = c(1,2,3,2,4,2,0), 
  "Away" = c("Arsenal", "Watford", "Westham", "Arsenal", "Norwich", "Liverpool", "ManCity"), 
  "Away_goals" = c(0,1,0,1,1,2,2), 
  "Home_corners" = c(5,7,10,7,6,4,3), 
  "Away_corners" = c(2,4,2,6,0,7,8),
  "Home_goals_2" = c(4,2,NA,NA,0,NA,NA),
  "Away_2" = c("Norwich", "Liverpool",NA,NA,"ManCity",NA,NA),
  "Away_goal_2" = c(1,2,NA,NA,2,NA,NA),
  "Home_corn_2" = c(6,4,NA,NA,3,NA,NA),
  "Away_corn_2" = c(0,7,NA,NA,8,NA,NA),
  "Home_goal_3" = c(0,NA,NA,NA,NA,NA,NA),
  "Away_3" = c("ManCity",NA,NA,NA,NA,NA,NA),
  "Away_goal_3" = c(2,NA,NA,NA,NA,NA,NA),
  "Home_corners_3" = c(3,NA,NA,NA,NA,NA,NA),
  "Away_corners_3" = c(8,NA,NA,NA,NA,NA,NA)
                                 )

Tottenham is the only team with all in all 3 games, so all columns are filled for Tottenham in row 1.

The 2nd entry for Tottenham in row 5 has only values for the second game because there is only one below that with Tottenham as home team in this example.

I hope, it's clearer now. Should be at least reproduceable.

Upvotes: 3

Views: 66

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 389012

We can group_by Home and use lead to get values from next rows.

library(dplyr)

premierleague %>%
  group_by(Home) %>%
  mutate_at(vars(Home_goals:Away_corners), list(`2` = ~lead(.), `3` = ~lead(., 2)))


#  Home  Home_goals Away  Away_goals Home_corners Away_corners Home_goals_2 Away_2
#  <fct>      <dbl> <fct>      <dbl>        <dbl>        <dbl>        <dbl> <fct> 
#1 Tott…          1 Arse…          0            5            2            4 Norwi…
#2 ManU…          2 Watf…          1            7            4            2 Liver…
#3 ManC…          3 West…          0           10            2           NA NA    
#4 Chel…          2 Arse…          1            7            6           NA NA    
#5 Tott…          4 Norw…          1            6            0            0 ManCi…
#6 ManU…          2 Live…          2            4            7           NA NA    
#7 Tott…          0 ManC…          2            3            8           NA NA    
# … with 8 more variables: Away_goals_2 <dbl>, Home_corners_2 <dbl>,
#   Away_corners_2 <dbl>, Home_goals_3 <dbl>, Away_3 <fct>, Away_goals_3 <dbl>,
#   Home_corners_3 <dbl>, Away_corners_3 <dbl>

Upvotes: 1

Related Questions