coding_heart

Reputation: 1295

How to aggregate variable by time period for time series plot in R using ggplot2

I would like to create a time series plot using ggplot2 in which a variable is plotted over time. However, for each time period, I would like to plot the total count for that period (here, the number of rows with y = 1 in each year). For example:

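library(ggplot2)
set.seed(123)
# 30 rows: 6 ids, years 2000 to 2005, and a binary outcome y
frame <- data.frame(id = sort(rep(0:5, 5)), year = rep(2000:2005, 5), y = sample(0:1, 30, replace = TRUE))
table(frame$year, frame$y)
ggplot(frame, aes(x = year, y = y)) + geom_point(shape = 1) # Not right: plots raw 0/1 values, not yearly counts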

I would ultimately like it to generate a plot like this:

# Number of rows with y == 1 in each year
count <- table(frame$year, frame$y)[, "1"]
plot(2000:2005, count, type = "l")


I'm new to ggplot2, and any pointers would be greatly appreciated. Thanks.

Upvotes: 1

Views: 2430

Answers (2)

DatamineR

Reputation: 9628

Try:

library(ggplot2)
library(dplyr)
# Sum y within each year, then pipe the summary straight into ggplot()
frame %>%
  group_by(year) %>%
  summarise(sum = sum(y)) %>%
  ggplot(aes(x = year, y = sum)) + geom_line()


Upvotes: 1

polka

Reputation: 1529

You are essentially missing a single line from your program. You need a data frame that holds the sum of the y variable for each year, and you plot that instead of the raw data.

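library(ggplot2)
set.seed(123)
frame <- data.frame(id = sort(rep(0:5, 5)), year = rep(2000:2005, 5), y = sample(0:1, 30, replace = TRUE))
table(frame$year, frame$y)
# Sum y within each year; aggregate() names the result columns Group.1 and x
newFrame <- aggregate(frame$y, list(frame$year), sum)
# Pass the aggregated data frame to ggplot() and refer to its columns by name
ggplot(newFrame, aes(x = Group.1, y = x)) + geom_point(shape = 1) # Better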
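
For what it's worth, the aggregation can also be left to ggplot2 itself instead of building a second data frame. A minimal sketch, assuming ggplot2 3.3.0 or later (where the argument is called fun; older versions used fun.y) and using geom = "line" so it resembles the base-R plot in the question:

```r
library(ggplot2)

set.seed(123)
frame <- data.frame(id = sort(rep(0:5, 5)),
                    year = rep(2000:2005, 5),
                    y = sample(0:1, 30, replace = TRUE))

# stat_summary() applies `fun` to the y values found at each distinct x (year),
# so the yearly sums are computed at plot time from the raw data
p <- ggplot(frame, aes(x = year, y = y)) +
  stat_summary(fun = sum, geom = "line") +
  stat_summary(fun = sum, geom = "point", shape = 1)
print(p)
```

Whether this is preferable to aggregating first is mostly a matter of taste: pre-aggregating keeps the yearly totals around for other uses, while stat_summary() keeps everything in one plotting call.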

Upvotes: 1
