Reputation: 2834
I have the following data:
set.seed(123)
timeseq <- as.Date(Sys.time() + cumsum(runif(1000)*86400))
data <- rnorm(1000)
df <- data.frame(timeseq,data)
I wanted to know if anyone has a method for aggregating this data by week. What I am ultimately trying to do is plot a time series with ggplot, so it would be even better if I could skip the aggregation step and have ggplot handle it. I've been stuck on this all day.
Upvotes: 6
Views: 11311
Reputation: 4298
I want to expand upon @chappers' idea of using the lubridate package, but in a fully piped way.
library(dplyr)
library(ggplot2)
library(lubridate)
set.seed(123)
data.frame(
  timeseq = as.Date(Sys.time() + cumsum(runif(1000) * 86400)),
  data = rnorm(1000)
) %>%
  # Round every date down to the first day of its week
  mutate(timeseq = floor_date(timeseq, unit = "week")) %>%
  group_by(timeseq) %>%
  # One row per week holding the weekly total
  summarise(data = sum(data)) %>%
  ggplot() +
  geom_line(aes(x = timeseq, y = data))
Replace the data.frame(...) call with df if you already have the data stored as an object.
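If you want to look at the weekly table before plotting it, here is a minimal sketch of the same aggregation stored as an object. The fixed start date, the column names week, total and n_obs, and the object name weekly are my own choices for illustration, not part of the original data.

```r
library(dplyr)
library(lubridate)

set.seed(123)
# Assumption: a fixed start date instead of Sys.time(), so the result is reproducible
df <- data.frame(
  timeseq = as.Date("2015-01-01") + floor(cumsum(runif(1000))),
  data = rnorm(1000)
)

weekly <- df %>%
  mutate(week = floor_date(timeseq, unit = "week")) %>%
  group_by(week) %>%
  # total is the weekly sum, n_obs counts the rows that went into it
  summarise(total = sum(data), n_obs = n())

head(weekly)
```

The n_obs column makes it easier to spot the first and last weeks, which are usually only partly covered by the data.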
Upvotes: 1
Reputation: 51
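You can also let ggplot do the summing with stat_summary() and set weekly axis breaks through the scale_x_date() function's breaks argument. Note that breaks only controls where the ticks fall: stat_summary() sums per distinct x value, so to get true weekly totals the x aesthetic has to be rounded down to the week first (for example with lubridate's floor_date(), as in the other answers).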
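library(ggplot2)
library(scales)  # provides date_format()
ggplot(df, aes(x = timeseq, y = data)) +
  # fun replaces fun.y, which is deprecated since ggplot2 3.3.0
  stat_summary(fun = sum, geom = "line") +
  scale_x_date(labels = date_format("%Y-%m-%d"),
               breaks = "1 week")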
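For weekly totals computed entirely inside ggplot, one possible variant is to round the x aesthetic down to the week inside aes(), so that stat_summary() sees a single x value per week. This is only a sketch: it swaps in base R's cut() for the rounding, and it assumes a fixed start date and ggplot2 3.3.0 or later (for the fun argument).

```r
library(ggplot2)

set.seed(123)
# Assumption: a fixed start date instead of Sys.time(), so the result is reproducible
df <- data.frame(
  timeseq = as.Date("2015-01-01") + floor(cumsum(runif(1000))),
  data = rnorm(1000)
)

# cut(..., breaks = "week") labels each date with the Monday that starts its week;
# as.Date() turns the resulting factor back into dates for the x axis.
p <- ggplot(df, aes(x = as.Date(cut(timeseq, breaks = "week")), y = data)) +
  stat_summary(fun = sum, geom = "line") +
  scale_x_date(date_breaks = "4 weeks", date_labels = "%Y-%m-%d") +
  xlab("Week")
p
```

The 4-week tick spacing is arbitrary. It only affects the axis labels, not the aggregation.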
Upvotes: 3
Reputation: 32426
Another way to manually aggregate by week using dplyr.
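library(dplyr)
library(ggplot2)
# Label each row with the first day of its week (cut() starts weeks on Monday by default)
df$weeks <- cut(df[, "timeseq"], breaks = "week")
# Sum the values within each week
agg <- df %>% group_by(weeks) %>% summarise(agg = sum(data))
# weeks is a factor, so convert it back to Date for the x axis
ggplot(agg, aes(as.Date(weeks), agg)) + geom_point() + scale_x_date() +
  ylab("Aggregated by Week") + xlab("Week") + geom_line()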
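One thing to be aware of with cut() is where each week starts. The small sketch below uses four made-up dates (my own example, not from the question) to show the default Monday start next to the Sunday start you get with start.on.monday = FALSE.

```r
# Four illustrative dates: a Sunday, a Monday, a Wednesday and the next Sunday
d <- as.Date(c("2015-03-01", "2015-03-02", "2015-03-04", "2015-03-08"))

# Default: each date is labelled with the Monday on or before it
monday_weeks <- as.Date(cut(d, breaks = "week"))

# start.on.monday = FALSE: each date is labelled with the Sunday on or before it
sunday_weeks <- as.Date(cut(d, breaks = "week", start.on.monday = FALSE))

data.frame(date = d, monday_weeks, sunday_weeks)
```

This matters when comparing against lubridate's floor_date(), which by default starts weeks on Sunday.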
Upvotes: 7
Reputation: 2415
To get the week we can use the lubridate library, with the floor_date function like so:
library(lubridate)
df$week <- floor_date(df$timeseq, "week")
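To see what floor_date does to a single date, here is a quick sketch with a made-up date. It assumes a lubridate version that has the week_start argument (1.7.0 or later, as far as I know) and that the lubridate.week.start option has not been changed from its default.

```r
library(lubridate)

d <- as.Date("2015-03-04")  # a Wednesday, chosen only for illustration

# Default behaviour: round down to the preceding Sunday
sunday_start <- floor_date(d, unit = "week")

# week_start = 1: round down to the preceding Monday instead
monday_start <- floor_date(d, unit = "week", week_start = 1)

sunday_start
monday_start
```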
We can plot the data using ggplot by doing a stat summary (there might be a better way?), and it will look like this:
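library(ggplot2)
# Helper that summarises y at each x with `fun` and draws the result in red
stat_sum_single <- function(fun, geom = "point", ...) {
  # fun replaces fun.y, which is deprecated since ggplot2 3.3.0
  stat_summary(fun = fun, colour = "red", geom = geom, size = 3, ...)
}
ggplot(df, aes(x = floor_date(timeseq, "week"), y = data)) +
  stat_sum_single(sum, geom = "line") +
  xlab("week")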
This draws the weekly sums as a red line.
Upvotes: 0