James Martherus

Reputation: 1043

Getting Cumulative Sum Over Time

I have data with goals scored for each player each season:

playerID <- c(1,2,3,1,2,3,1,2,3,1,2,3)
year <- c(2002,2000,2000,2003,2001,2001,2000,2002,2002,2001,2003,2003)
goals <- c(25,21,27,31,39,34,42,44,46,59,55,53)
my_data <- data.frame(playerID, year, goals)

I would like to plot each player's cumulative number of goals over time:

ggplot(my_data, aes(x=year, y=cumsum_goals, group=playerID)) + geom_line()

I have tried using mutate with cumsum from dplyr, but this only gives the right result if the data is already sorted by year (see player 1):

new_data <- my_data %>%
  group_by(playerID) %>%
  mutate(cumsum_goals=cumsum(goals))

Is there a way to make this code robust to data where years are not in chronological order?

Upvotes: 3

Views: 133

Answers (1)

Ronak Shah

Reputation: 389325

cumsum accumulates in row order, so arrange the rows by playerID and year first, then take the cumulative sum within each player and plot it:

library(dplyr)
library(ggplot2)

my_data %>%
  arrange(playerID, year) %>%                 # chronological order within each player
  group_by(playerID) %>%
  mutate(cumsum_goals = cumsum(goals)) %>%    # running total per player
  ggplot(aes(x = year, y = cumsum_goals, color = factor(playerID))) +
  geom_line()
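
For comparison, the same sort-then-accumulate idea can be sketched in base R without dplyr. This is only an illustration of the approach above, not a replacement for it; the name `sorted` is made up for this example, and the data is copied from the question so the snippet runs on its own.

```r
# Data from the question
playerID <- c(1,2,3,1,2,3,1,2,3,1,2,3)
year <- c(2002,2000,2000,2003,2001,2001,2000,2002,2002,2001,2003,2003)
goals <- c(25,21,27,31,39,34,42,44,46,59,55,53)
my_data <- data.frame(playerID, year, goals)

# Sort rows by player, then by year within each player
# ("sorted" is a hypothetical name used only in this sketch)
sorted <- my_data[order(my_data$playerID, my_data$year), ]

# ave() applies cumsum separately within each playerID group and
# returns the results aligned with the rows of `sorted`
sorted$cumsum_goals <- ave(sorted$goals, sorted$playerID, FUN = cumsum)

# Player 1 now accumulates chronologically: 42, 101, 126, 157
sorted[sorted$playerID == 1, ]
```

Because the sort happens before the cumulative sum, the original row order of my_data no longer matters.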


Upvotes: 3
