How does one evaluate changes in a categorical variable over time using R?

Question

I have a dataset where two teams square off in an annual game. There are two divisions where these games take place, East and West. I want to determine who the reigning champion is for a given year based on results from a previous year's game. I'd like to do this for both divisions.

Here's my dataset:

data <- data.frame(
  Team = c("Hot Dogs", "Hamburgers", "Hot Dogs", "Hamburgers", "Hot Dogs",
           "Hamburgers", "Pho", "Ramen", "Pho", "Ramen", "Pho", "Ramen"),
  Division = c("West", "West", "West", "West", "West", "West", "East", "East",
               "East", "East", "East", "East"),
  Year = c("2017", "2017", "2018", "2018", "2019", "2019", "2017", "2017",
           "2018", "2018", "2019", "2019"),
  Score = c("37", "2", "26", "32", "37", "9", "22", "31", "25", "32", "24", "18"))

Ideally I would add a "Result" column to the original data indicating whether the given team is the reigning champion going into that game. Something like this:

data$Result <- c("Initial Champion", "NA", "Champion", "NA", "NA", "Champion", "NA",
                 "Initial Champion", "NA", "Champion", "NA", "Champion")

Is there a straightforward way to do this using R, specifically using the tidyverse library if possible?

I appreciate any advice. Thanks in advance.

StupidWolf · Accepted Answer

First, build a table with the top-scoring team for each year in each division. Label the first one in each division "Initial Champion" and the rest "Champion":

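library(dplyr)

# Score is not numeric in the example data, so desc(Score) orders it as text
# ("9" sorts above "37").
X <- data %>%
  arrange(Year, desc(Score)) %>%    # top score first within each year
  group_by(Division) %>%
  filter(!duplicated(Year)) %>%     # keep the first row per year in each division
  mutate(result = rep(c("Initial Champion", "Champion"), times = c(1, n() - 1)))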

# A tibble: 6 x 5
# Groups:   Division [2]
  Team       Division Year  Score result
1 Hot Dogs   West     2017  37    Initial Champion
2 Ramen      East     2017  31    Initial Champion
3 Hamburgers West     2018  32    Champion        
4 Ramen      East     2018  32    Champion        
5 Hamburgers West     2019  9     Champion        
6 Pho        East     2019  24    Champion

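To get your final table, join it back onto the original data:

# Join on the columns the two tables share; rows without a match get NA in result
left_join(data, X, by = c("Team", "Division", "Year", "Score"))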
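
If what you want is the reigning champion going into each game (last year's winner, as in the Result column from the question), a variation along these lines might help. It is only a sketch under two assumptions of mine: that scores should be compared as numbers rather than text, and that a division's first year should be labelled with that year's own winner. The names games, winners, reigning, Winner, Reigning and Label are made up for illustration; games is just a copy of the question's data so the snippet runs on its own.

```r
library(dplyr)

# Copy of the example data from the question (hypothetical name: games).
games <- data.frame(
  Team = c("Hot Dogs", "Hamburgers", "Hot Dogs", "Hamburgers", "Hot Dogs",
           "Hamburgers", "Pho", "Ramen", "Pho", "Ramen", "Pho", "Ramen"),
  Division = rep(c("West", "East"), each = 6),
  Year = rep(rep(c("2017", "2018", "2019"), each = 2), times = 2),
  Score = c("37", "2", "26", "32", "37", "9", "22", "31", "25", "32", "24", "18"),
  stringsAsFactors = FALSE
)

# Assumption: compare scores as numbers, not as text.
games <- games %>% mutate(Score = as.numeric(Score))

# One row per division and year: the team with the highest score.
winners <- games %>%
  group_by(Division, Year) %>%
  slice_max(Score, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(Division, Year, Winner = Team)

# Reigning champion going into a year = the previous year's winner.
# Assumption: in a division's first year, use that year's own winner.
reigning <- winners %>%
  arrange(Division, Year) %>%
  group_by(Division) %>%
  mutate(
    Reigning = coalesce(lag(Winner), Winner),
    Label = if_else(row_number() == 1, "Initial Champion", "Champion")
  ) %>%
  ungroup() %>%
  select(Division, Year, Reigning, Label)

# Attach the label to the reigning team's row; every other row gets NA.
result <- games %>%
  left_join(reigning, by = c("Division", "Year")) %>%
  mutate(Result = if_else(Team == Reigning, Label, NA_character_)) %>%
  select(-Reigning, -Label)

result
```

On the example data this should reproduce the Result column from the question, with real NA values instead of the string "NA". It relies on slice_max(), which needs dplyr 1.0.0 or later.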
