steve

Reputation: 21

How to plot time series data with ggplot2 in R

I have data that looks like this.

head(histogram)
  year month day create verified trans
1 2015    12  10      2        2     2
2 2015    12  14      3        1    NA
3 2016     1   6      1       NA    NA
4 2016     1  15      1       NA    NA
5 2016     1  17      1        1    NA
6 2016     1  25      1       NA    NA

Year, month, and day are in separate columns. I want to plot a bar graph grouped by week.

For example, the rows from 2016-1-1 to 2016-1-6 would be grouped into one position on the x axis with 3 bars: the sums of create, verified, and trans, respectively. I would prefer ggplot2, but anything would be fine.

Upvotes: 0

Views: 1477

Answers (1)

bVa

Reputation: 3948

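I recommend using a date-time class such as POSIXct when you want to work with time series and ggplot2.

Note that you have to handle week 00: with the "%W" format it holds the first days of January that fall before the year's first Monday, and those days belong to the same calendar week as the last days of December.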
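To make the week 00 point concrete, here is a small base R sketch (not part of the original answer) comparing "%W" with the ISO 8601 codes "%V" and "%G" on a few dates around the 2015/2016 boundary. The sample dates and the variable names are only illustrative.

```r
# A few dates around the 2015/2016 year boundary (illustrative only)
d <- as.Date(c("2015-12-28", "2015-12-31", "2016-01-01", "2016-01-03", "2016-01-04"))

# %W: week of the year with Monday as the first day of the week;
# days before the year's first Monday are reported as week "00"
w_monday <- format(d, "%W")

# ISO 8601 alternative: %V is the ISO week number, %G the week-based year,
# so the first days of January stay attached to the last week of December
w_iso <- format(d, "%G-W%V")

print(data.frame(date = d, w_monday, w_iso))
```

With "%W" the first three days of January 2016 land in "00" while the preceding Monday to Thursday are in "52", so one calendar week is split into two bars. The ISO label keeps them together as 2015-W53.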

## Fake data (the question does not include a reproducible example)
set.seed(1)
df = data.frame(year = c(rep(2015,14), rep(2016,21)), 
                month = c(rep(12,14), rep(01,21)), day = c(seq(18,31,1), seq(01,21,1)), 
                create = sample(c(1,2,3,NA),35, replace = T, prob = c(0.3,0.3,0.3,0.1)), 
                verified = sample(c(1,2,3,NA),35, replace = T, prob = c(0.1,0.1,0.1,0.7)), 
                trans = sample(c(1,2,3,NA),35, replace = T, prob = c(0.1,0.2,0.1,0.6)))

# Add week information
df$date_posix = as.POSIXct(paste0(df$year, "-", df$month, "-", df$day))
df$week = strftime(df$date_posix ,format = "%W") 

# summarize
require(plyr)
#> Loading required package: plyr
df_sum = ddply(df, "week", summarize, 
create_sum = sum(create, na.rm = T), 
verified_sum = sum(verified, na.rm = T), 
trans_sum = sum(trans, na.rm = T))

# melt
require(reshape2)
#> Loading required package: reshape2
df_sum_melt = melt(df_sum, id = "week")

# plot
require(ggplot2)
#> Loading required package: ggplot2
ggplot(df_sum_melt, aes(x = week, y = value, fill = variable)) + 
geom_bar(stat = "identity", position = "dodge")

Created on 2018-09-18 by the reprex package (v0.2.0).

EDIT (the tidyverse way)

library(tidyverse)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date
set.seed(1)
tibble(year = c(rep(2015,14), rep(2016,21)), 
       month = c(rep(12,14), rep(01,21)), day = c(seq(18,31,1), seq(01,21,1)), 
       create =     sample(c(1,2,3,NA),35, replace = T, prob = c(0.3,0.3,0.3,0.1)), 
       verified  = sample(c(1,2,3,NA),35, replace = T, prob = c(0.1,0.1,0.1,0.7)), 
       trans  = sample(c(1,2,3,NA),35, replace = T, prob = c(0.1,0.2,0.1,0.6))) %>%
  mutate(date_posix = as.Date(paste0(year, "-", month, "-", day)),
         week = lubridate::week(date_posix)) %>%
  group_by(week) %>%
  summarise(create_sum = sum(create, na.rm = T), 
            verified_sum = sum(verified, na.rm = T), 
            trans_sum = sum(trans, na.rm = T)) %>%
  gather(variable, value, -week) %>%
  ggplot(., aes(x = factor(week), y = value, fill = variable)) + 
  geom_bar(stat = "identity", position = "dodge")

Created on 2018-09-18 by the reprex package (v0.2.0).
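Both versions above group on a week number, so weeks from different years are ordered by number rather than by date on the x axis. One possible alternative, sketched here in base R under my own assumptions (the tiny sample data and the names `week_start` and `weekly` are hypothetical, not from the question), is to group on the Monday that starts each week:

```r
# Small hypothetical sample in the same shape as the question's data
df <- data.frame(
  year     = c(2015, 2015, 2016, 2016, 2016),
  month    = c(12, 12, 1, 1, 1),
  day      = c(28, 31, 1, 4, 6),
  create   = c(2, 3, 1, 1, NA),
  verified = c(2, 1, NA, NA, 1),
  trans    = c(2, NA, NA, NA, NA)
)

# Build a Date from the three separate columns
df$date <- as.Date(sprintf("%d-%02d-%02d", df$year, df$month, df$day))

# Monday of the week each date falls in (%u: weekday number, Monday = 1)
df$week_start <- df$date - (as.integer(format(df$date, "%u")) - 1)

# Sum each column per week; na.pass keeps rows with NA so that
# na.rm = TRUE can drop the missing values inside sum()
weekly <- aggregate(cbind(create, verified, trans) ~ week_start, data = df,
                    FUN = sum, na.rm = TRUE, na.action = na.pass)

print(weekly)
```

Because `week_start` is a Date, the result can be reshaped with melt() or gather() as above and plotted with x = week_start, which keeps the bars in chronological order across the year boundary.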

Upvotes: 4
