Reputation: 1043
I have data with goals scored for each player each season:
playerID <- c(1,2,3,1,2,3,1,2,3,1,2,3)
year <- c(2002,2000,2000,2003,2001,2001,2000,2002,2002,2001,2003,2003)
goals <- c(25,21,27,31,39,34,42,44,46,59,55,53)
my_data <- data.frame(playerID, year, goals)
I would like to plot each player's cumulative number of goals over time:
ggplot(my_data, aes(x=year, y=cumsum_goals, group=playerID)) + geom_line()
I have tried using mutate with cumsum from dplyr, but this only works if the data is already sorted by year (see player 1, whose rows are not in chronological order):
library(dplyr)

new_data <- my_data %>%
  group_by(playerID) %>%
  mutate(cumsum_goals = cumsum(goals))  # cumsum follows row order, not year order
Is there a way to make this code robust to data where years are not in chronological order?
Upvotes: 3
Views: 133
Reputation: 389325
We can arrange by playerID and year, take cumsum, and then plot:
library(dplyr)
library(ggplot2)
my_data %>%
  arrange(playerID, year) %>%
  group_by(playerID) %>%
  mutate(cumsum_goals = cumsum(goals)) %>%
  ggplot() +
  aes(x = year, y = cumsum_goals, color = factor(playerID)) +
  geom_line()
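If you would rather keep the rows in their original order, a possible alternative is dplyr's order_by(), which evaluates a window function such as cumsum as if the rows were sorted by another column. This is only a sketch on the sample data from the question; the name robust_data is just a placeholder I picked for illustration.

```r
library(dplyr)

# Sample data from the question (player 1's years are out of order)
playerID <- c(1,2,3,1,2,3,1,2,3,1,2,3)
year <- c(2002,2000,2000,2003,2001,2001,2000,2002,2002,2001,2003,2003)
goals <- c(25,21,27,31,39,34,42,44,46,59,55,53)
my_data <- data.frame(playerID, year, goals)

# robust_data is a hypothetical name used only for this example
robust_data <- my_data %>%
  group_by(playerID) %>%
  # order_by() applies cumsum in year order within each player,
  # then returns the values in the original row order
  mutate(cumsum_goals = order_by(year, cumsum(goals))) %>%
  ungroup()
```

Since geom_line connects observations in order of the x variable, robust_data can go into the same ggplot call without sorting first.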
Upvotes: 3