Jdv
Jdv

Reputation: 329

Adding rows and updating following rows conditionally

I'd like to add a row before certain condition in my data frame. In this case, every time Column "Days" is >= 3, I'd like to insert a row before.

The problem involves traveling from one city to another, so if the number of days between travels is >= 3 I'd like for that team to return to its home city, and then start the next travel from there. Therefore, the new row should reflect the correct coordinates for the home city, as well as to update the coordinates for the next trip (next row) since the team will be traveling from a different city.

Below is an example of the dataset

Date <- c("2017-10-20", "2017-10-22", "2017-10-25", "2017-10-26", "2017-10-27", "2017-11-01", "2017-11-03")
Team <-  c("Team A", "Team A", "Team A", "Team A", "Team A", "Team A", "Team A")
Opponent <- c("Opp 1 ", "Opp 2", "Opp 3", "Opp 4", "Opp 5", "Opp 6", "Opp 7")
Location <- c("Away", "Away", "Away", "Away", "Home", "Away", "Home")
City <- c("Opp 1 City",  "Opp 2 City", "Opp 3 City", "Opp 4 City", "Home City", "Opp 6 City", "Home City")
lat <- c(35.20, 40.67, 25.78, 41.84, 33.76, 40.01, 33.76)
lon <- c(-80.83, -73.94, -80.21, -87.68, -84.42, -75.13, -84.42)
destlat <- c(32.79, 35.20, 40.67, 25.78, 41.84, 33.76, 40.01)
destlon <- c(-96.77, -80.83, -73.94, -80.21, -87.68, -84.42, -75.13)
Days <- c(NA, 2, 3, 1, 1, 5, 2)

df <- tibble(Date, Team, Opponent, Location, City, lat, lon, destlat, destlon, Days) %>% 
  dplyr::mutate(Date = as.Date(Date))

df

And here a manually added solution of what the correct output should look like after adding the conditional row.

# desired output

Date <- c("2017-10-20", "2017-10-22", "2017-10-24", "2017-10-25", "2017-10-26", "2017-10-27", "2017-11-01", "2017-11-03")
Team <-  c("Team A", "Team A", "Team A", "Team A", "Team A", "Team A", "Team A", "Team A")
Opponent <- c("Opp 1 ", "Opp 2", NA, "Opp 3", "Opp 4", "Opp 5", "Opp 6", "Opp 7")
Location <- c("Away", "Away", NA, "Away", "Away", "Home", "Away", "Home")
City <- c("Opp 1 City",  "Opp 2 City", "Home City", "Opp 3 City" ,"Opp 4 City", "Home City", "Opp 6 City", "Home City")
lat <- c(35.20, 40.67, 33.76, 25.78, 41.84, 33.76, 40.01, 33.76)
lon <- c(-80.83, -73.94, -84.41, -80.21, -87.68, -84.42, -75.13, -84.42)
destlat <- c(32.79, 35.20, 40.67, 33.76, 25.78, 41.84, 33.76, 40.01)
destlon <- c(-96.77, -80.83, -73.94, -84.42, -80.21, -87.68, -84.42, -75.13)
Days <- c(NA, 2, NA, 3, 1, 1, 5, 2)

df <- tibble(Date, Team, Opponent, Location, City, lat, lon, destlat, destlon, Days) %>% 
  dplyr::mutate(Date = as.Date(Date))

df

Ideally looking for a dplyr solution but open to other options if needed.

Upvotes: 0

Views: 150

Answers (2)

Eric Krantz
Eric Krantz

Reputation: 2189

This solution works for multiple teams:

# Create home team location df
Team <- c("Team A", "Team B")
lat <- c(33.8, 50)
lon <- c(-84.4, -90)
df_loc <- tibble(Team, lat, lon)

# function to create new row
new.row <- function(x) {
    Date <- x$Date - 1
    Team <- x$Team
    City <- x$City
    lat <- df_loc$lat[df_loc$Team == Team]
    lon <- df_loc$lon[df_loc$Team == Team]
    destlat <- x$destlat
    destlon <- x$destlon
    new_row <- tibble(Date = Date, 
                      Team = Team,
                      Opponent = NA,
                      Location = NA,
                      City = City,
                      lat = lat,
                      lon = lon,
                      destlat = destlat,
                      destlon = destlon,
                      Days = NA)
} 

# find the row where Days >= 3
row_loc <- which(df$Days >= 3)[1]

# call the function
new_row <- new.row(df[row_loc, ])

# add the row
df <- df %>% add_row(new_row, .before = row_loc)

# set the old Days (>=3) to 1
df$Days[row_loc +1] <- 1

df

Upvotes: 2

iod
iod

Reputation: 7592

Not the most elegant of solutions, but what I did is I created the new rows that are needed by modifying existing rows in a filtered dataframe, and then bound them back with everything else.

bind_rows(
  filter(df,Days>=3) %>% mutate(destlat="33.76", destlon="-84.42"),
  filter(df,Days>=3 & lag(Location!="Home")| lead(Days>=3) & Location!="Home") %>% 
    mutate(Date=lag(Date)+lag(Days), Location="Home",City="Home City",destlat=lag(lat),destlon=lag(lon),lat="33.76",lon="-84.42",Opponent=NA) %>% 
    filter(Days>=3) %>% 
    mutate(Days=NA),
  filter(df,Days<3)) %>%
  arrange(Date)

# A tibble: 7 x 10
  Date       Team   Opponent Location City       lat   lon    destlat destlon  Days
  <date>     <chr>  <chr>    <chr>    <chr>      <chr> <chr>  <chr>   <chr>   <dbl>
1 2017-10-22 Team A Opp 2    Away     Opp 2 City 40.67 -73.94 35.20   -80.83      2
2 2017-10-24 Team A NA       Home     Home City  33.76 -84.42 40.67   -73.94     NA
3 2017-10-25 Team A Opp 3    Away     Opp 3 City 25.78 -80.21 33.76   -84.42      3
4 2017-10-26 Team A Opp 4    Away     Opp 4 City 41.84 -87.68 25.78   -80.21      1
5 2017-10-27 Team A Opp 5    Home     Home City  33.76 -84.42 41.84   -87.68      1
6 2017-11-01 Team A Opp 6    Away     Opp 6 City 40.01 -75.13 33.76   -84.42      5
7 2017-11-03 Team A Opp 7    Home     Home City  33.76 -84.42 40.01   -75.13      2

Upvotes: 1

Related Questions