Reputation: 307
My data set looks like this:
Date/Time                Teams          xScore
Saturday, Jan 16th 2021  NA             NA
10:30 AM                 Fulham         0.9125
NA                       Chelsea        1.1634
NA                       Draw           NA
NA                       NA             NA
1:00 PM                  Leicester      1.4562
NA                       Southampton    1.4613
NA                       Draw           NA
NA                       NA             NA
Sunday, Jan 17th 2021    NA             NA
6:00 AM                  Sheffield Utd  0.8965
NA                       Tottenham      1.6542
NA                       Draw           NA
NA                       NA             NA
I am trying to add a new column that performs a calculation where xScore is not NA. Here is a simple version of what I am trying to do:
for (i in 1:length(mytable$xScore)){
  if(is.na(mytable$xScore[i]) != TRUE){
    if(is.na(mytable$xScore[i+1]) != TRUE){
      mytable$HomeAway[i] <- "Home"
      mytable$HomeAway[i+1] <- "Away"
    }
  }
}
When I run the for loop I get this error: "replacement has 2 rows, data has 49"
It seems to me the loop is just stopping when the two if criteria are false. I am not sure how to get it to add a column of home and away for each Team. I want it to look like this:
Date/Time                Teams          xScore  HomeAway
Saturday, Jan 16th 2021  NA             NA      NA
10:30 AM                 Fulham         0.9125  Home
NA                       Chelsea        1.1634  Away
NA                       Draw           NA      NA
NA                       NA             NA      NA
1:00 PM                  Leicester      1.4562  Home
NA                       Southampton    1.4613  Away
NA                       Draw           NA      NA
NA                       NA             NA      NA
Sunday, Jan 17th 2021    NA             NA      NA
6:00 AM                  Sheffield Utd  0.8965  Home
NA                       Tottenham      1.6542  Away
NA                       Draw           NA      NA
NA                       NA             NA      NA
Upvotes: 1
Views: 350
Reputation: 129
Here is another solution using dplyr. No for loop is needed.
library(tidyr)
library(dplyr)
df <- tribble(
~`Date/Time` , ~Teams , ~xScore,
"Saturday, Jan 16th 2021", NA , NA,
"10:30 AM" , "Fulham" , 0.9125,
NA , "Chelsea" , 1.1634,
NA , "Draw" , NA,
NA , NA , NA,
"1:00 PM" , "Leicester" , 1.4562,
NA , "Southampton" , 1.4613,
NA , "Draw" , NA,
NA , NA , NA,
"Sunday, Jan 17th 2021" , NA , NA,
"6:00 AM" , "Sheffield Utd", 0.8965,
NA , "Tottenham" , 1.6542,
NA , "Draw" , NA,
NA , NA , NA
)
df %>%
  mutate(HomeAway = case_when(
    is.na(lag(xScore)) & !is.na(xScore) ~ "Home",
    !is.na(xScore) & !is.na(lag(xScore)) ~ "Away",
    TRUE ~ NA_character_
  ))
#> # A tibble: 14 x 4
#> `Date/Time` Teams xScore HomeAway
#> <chr> <chr> <dbl> <chr>
#> 1 Saturday, Jan 16th 2021 <NA> NA <NA>
#> 2 10:30 AM Fulham 0.912 Home
#> 3 <NA> Chelsea 1.16 Away
#> 4 <NA> Draw NA <NA>
#> 5 <NA> <NA> NA <NA>
#> 6 1:00 PM Leicester 1.46 Home
#> 7 <NA> Southampton 1.46 Away
#> 8 <NA> Draw NA <NA>
#> 9 <NA> <NA> NA <NA>
#> 10 Sunday, Jan 17th 2021 <NA> NA <NA>
#> 11 6:00 AM Sheffield Utd 0.896 Home
#> 12 <NA> Tottenham 1.65 Away
#> 13 <NA> Draw NA <NA>
#> 14 <NA> <NA> NA <NA>
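For readers who want to see what the lag() comparison is doing, here is a rough base R sketch of the same idea. It is not the answer's code: `prev` is a hypothetical stand-in for `dplyr::lag(xScore)`, built by shifting the vector down one position, and the xScore values are copied from the question.

```r
# xScore column from the question's sample data
xScore <- c(NA, 0.9125, 1.1634, NA, NA, 1.4562, 1.4613,
            NA, NA, NA, 0.8965, 1.6542, NA, NA)

# Hypothetical helper: previous row's value (first element has no predecessor)
prev <- c(NA, head(xScore, -1))

HomeAway <- rep(NA_character_, length(xScore))
# Filled row whose previous row is empty: first team of the pair
HomeAway[!is.na(xScore) & is.na(prev)] <- "Home"
# Filled row whose previous row is also filled: second team of the pair
HomeAway[!is.na(xScore) & !is.na(prev)] <- "Away"

HomeAway
```

Like the case_when() version, this assumes filled rows only ever come in pairs; a run of three filled rows would label the last two both "Away".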
Upvotes: 1
Reputation: 160437
ind <- !is.na(mytable$xScore)
ind <- which(ind[-length(ind)] & ind[-1])
mytable$HomeAway <- NA_character_
mytable$HomeAway[ind] <- "Home"
mytable$HomeAway[ind+1] <- "Away"
mytable
# Date.Time Teams xScore HomeAway
# 1 Saturday, Jan 16th 2021 <NA> <NA> <NA>
# 2 10:30 AM Fulham 0.9125 Home
# 3 <NA> Chelsea 1.1634 Away
# 4 <NA> Draw <NA> <NA>
# 5 <NA> <NA> <NA> <NA>
# 6 1:00 PM Leicester 1.4562 Home
# 7 <NA> Southampton 1.4613 Away
# 8 <NA> Draw <NA> <NA>
# 9 <NA> <NA> <NA> <NA>
# 10 Sunday, Jan 17th 2021 <NA> <NA> <NA>
# 11 6:00 AM Sheffield Utd 0.8965 Home
# 12 <NA> Tottenham 1.6542 Away
# 13 <NA> Draw <NA> <NA>
# 14 <NA> <NA> <NA> <NA>
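The indexing trick above may be easier to follow on a small vector. The sketch below is only an illustration; `left`, `right`, and `starts` are names made up here, and the logical vector is invented toy data.

```r
# TRUE marks a row that has an xScore (toy data, not the real table)
ind <- c(FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE)

left  <- ind[-length(ind)]  # rows 1..(n-1)
right <- ind[-1]            # rows 2..n, lined up against their predecessors

# Positions i where row i and row i+1 are both filled
starts <- which(left & right)

starts      # 2 6  -> "Home" rows
starts + 1  # 3 7  -> "Away" rows
```

Dropping the last element from one copy and the first from the other lines each row up with its successor, so a single vectorised `&` replaces the nested if checks in the loop.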
Upvotes: 0
Reputation: 24079
Here is a solution that avoids the for loop.
The which function finds the rows that have a value in xScore, and the code then alternates "Home" and "Away" across those rows.
mytable <- structure(list(Date.Time = c("Saturday, Jan 16th 2021", "10:30 AM",
NA, NA, NA, "1:00 PM", NA, NA, NA, "Sunday, Jan 17th 2021", "6:00 AM",
NA, NA, NA), Teams = c(NA, "Fulham", "Chelsea", "Draw", NA, "Leicester",
"Southampton", "Draw", NA, NA, "Sheffield Utd", "Tottenham",
"Draw", NA), xScore = c(NA, 0.9125, 1.1634, NA, NA, 1.4562, 1.4613,
NA, NA, NA, 0.8965, 1.6542, NA, NA)), class = "data.frame", row.names = c(NA,
-14L))
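mytable$HomeAway <- NA_character_
# find rows with a value in the xScore column
filledrows <- which(!is.na(mytable$xScore))
# alternate by position among the filled rows, not by row number:
# the 1st, 3rd, 5th, ... filled row is "Home"; the 2nd, 4th, 6th, ... is "Away"
mytable$HomeAway[filledrows] <- ifelse(seq_along(filledrows) %% 2 == 1, "Home", "Away")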
mytable
Upvotes: 1
Reputation: 21937
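The problem is that HomeAway does not exist when the loop first assigns to it, so R builds a new vector only as long as the index (2 elements) and cannot fit it into the data frame. You need to initialize the HomeAway variable first: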
mytable <- tibble::tribble(
~`Date/Time`, ~Teams, ~xScore,
"Saturday, Jan 16th 2021", NA, NA,
"10:30 AM", "Fulham", 0.9125,
NA, "Chelsea", 1.1634,
NA, "Draw", NA,
NA, NA, NA,
"1:00 PM", "Leicester", 1.4562,
NA, "Southampton", 1.4613,
NA, "Draw", NA,
NA, NA, NA,
"Sunday, Jan 17th 2021", NA, NA,
"6:00 AM", "Sheffield Utd", 0.8965,
NA, "Tottenham", 1.6542,
NA, "Draw", NA,
NA, NA, NA
)
mytable$HomeAway <- NA_character_
for (i in seq_len(nrow(mytable) - 1)) {
  # a match starts where this row and the next both have an xScore
  if (!is.na(mytable$xScore[i]) && !is.na(mytable$xScore[i + 1])) {
    mytable$HomeAway[i] <- "Home"
    mytable$HomeAway[i + 1] <- "Away"
  }
}
mytable
# # A tibble: 14 x 4
# `Date/Time` Teams xScore HomeAway
# <chr> <chr> <dbl> <chr>
# 1 Saturday, Jan 16th 2021 NA NA NA
# 2 10:30 AM Fulham 0.912 Home
# 3 NA Chelsea 1.16 Away
# 4 NA Draw NA NA
# 5 NA NA NA NA
# 6 1:00 PM Leicester 1.46 Home
# 7 NA Southampton 1.46 Away
# 8 NA Draw NA NA
# 9 NA NA NA NA
# 10 Sunday, Jan 17th 2021 NA NA NA
# 11 6:00 AM Sheffield Utd 0.896 Home
# 12 NA Tottenham 1.65 Away
# 13 NA Draw NA NA
# 14 NA NA NA NA
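To see where the original error message comes from, here is a small sketch. It assumes a plain base R data.frame (a tibble words the error differently), and `df`, `x`, and `msg` are names invented for the illustration.

```r
df <- data.frame(xScore = c(NA, 0.9125, 1.1634, NA))

# Indexing into something that does not exist yet grows a brand-new vector
x <- NULL
x[2] <- "Home"
length(x)  # 2, not nrow(df)

# The same thing happens with a missing column, and the short vector
# cannot be stored in a 4-row data frame
msg <- tryCatch({
  df$HomeAway[2] <- "Home"
  "no error"
}, error = function(e) conditionMessage(e))
msg

# Once the column exists with the right length, element assignment works
df$HomeAway <- NA_character_
df$HomeAway[2] <- "Home"
df$HomeAway[3] <- "Away"
df
```

In the question the first pair of filled rows starts at row 2, which is why the message reports a replacement of 2 rows against the 49 rows of data.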
Upvotes: 1