Reputation: 21
I have data that looks like this:
head(histogram)
  year month day create verified trans
1 2015    12  10      2        2     2
2 2015    12  14      3        1    NA
3 2016     1   6      1       NA    NA
4 2016     1  15      1       NA    NA
5 2016     1  17      1        1    NA
6 2016     1  25      1       NA    NA
Year, month, and day are in separate columns. I wish to plot a bar graph grouped by week.
For example, the rows from 2016-1-1 to 2016-1-6 would be grouped at one position on the x axis and produce three bars: the sums of create, verified, and trans, respectively. I would prefer ggplot2, but anything would be fine.
Upvotes: 0
Views: 1477
Reputation: 3948
I recommend using POSIX format when you want to work with time series and ggplot2.
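Note that you have to handle week 00: with the %W format, the first days of January that fall before the year's first Monday are numbered week 00, although they belong to the same calendar week as the last days of December (week 52 here).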
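To make the year-boundary issue concrete, here is a minimal base-R sketch. The three dates are illustrative and the names dates, week_w and week_iso are hypothetical. 2015-12-31 and 2016-01-01 fall in the same Monday-to-Sunday week, yet %W gives them different numbers. One possible workaround, assuming your platform supports the ISO 8601 specifiers %G and %V on output, is to key on the ISO week-based year and week instead.

```r
dates <- as.Date(c("2015-12-31", "2016-01-01", "2016-01-04"))

# %W: week of the year with Monday as first day of the week;
# days before the first Monday of the year are week "00"
week_w <- format(dates, "%W")        # "52" "00" "01"

# ISO 8601 week-based year plus week number: a key that does not
# split a calendar week across two years
week_iso <- format(dates, "%G-W%V")  # "2015-W53" "2015-W53" "2016-W01"

print(data.frame(dates, week_w, week_iso))
```

With the ISO key, the last days of December and the first days of January end up in the same group, so no separate handling of week 00 is needed.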
## Fake data, because the question has no reproducible example
set.seed(1)
df = data.frame(year = c(rep(2015, 14), rep(2016, 21)),
                month = c(rep(12, 14), rep(1, 21)),
                day = c(seq(18, 31, 1), seq(1, 21, 1)),
                create = sample(c(1, 2, 3, NA), 35, replace = TRUE, prob = c(0.3, 0.3, 0.3, 0.1)),
                verified = sample(c(1, 2, 3, NA), 35, replace = TRUE, prob = c(0.1, 0.1, 0.1, 0.7)),
                trans = sample(c(1, 2, 3, NA), 35, replace = TRUE, prob = c(0.1, 0.2, 0.1, 0.6)))
# Add week information: build a POSIXct date, then extract the Monday-based week number (%W)
df$date_posix = as.POSIXct(paste0(df$year, "-", df$month, "-", df$day))
df$week = strftime(df$date_posix, format = "%W")
# Summarize: sum each column per week, ignoring NA
require(plyr)
#> Loading required package: plyr
df_sum = ddply(df, "week", summarize,
               create_sum = sum(create, na.rm = TRUE),
               verified_sum = sum(verified, na.rm = TRUE),
               trans_sum = sum(trans, na.rm = TRUE))
# Melt to long format: one row per week and variable
require(reshape2)
#> Loading required package: reshape2
df_sum_melt = melt(df_sum, id = "week")
# Plot: one group of dodged bars per week
require(ggplot2)
#> Loading required package: ggplot2
ggplot(df_sum_melt, aes(x = week, y = value, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge")
Created on 2018-09-18 by the reprex package (v0.2.0).
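If the weeks should appear in chronological order across the year boundary, one option is to group by the Monday that starts each week rather than by the week number. The following is a base-R sketch on a small made-up data frame (toy, week_start and toy_sum are hypothetical names), not the method used above. It relies on cut() for Date objects, which with breaks = "week" labels each value with the start of its week (Monday by default).

```r
# Small illustrative data set spanning the 2015/2016 boundary (made-up values)
toy <- data.frame(
  date   = as.Date(c("2015-12-30", "2015-12-31", "2016-01-01",
                     "2016-01-04", "2016-01-06")),
  create = c(1, 2, NA, 3, 1)
)

# Label every row with the Monday that starts its week
toy$week_start <- as.Date(as.character(cut(toy$date, breaks = "week")))

# Sum per week, ignoring NA values
toy_sum <- aggregate(toy["create"],
                     by = list(week_start = toy$week_start),
                     FUN = sum, na.rm = TRUE)
print(toy_sum)
```

Because week_start is a real date, using it as the x aesthetic in ggplot2 would keep the December weeks before the January ones.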
EDIT (the tidyverse way)
library(tidyverse)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
set.seed(1)
tibble(year = c(rep(2015, 14), rep(2016, 21)),
       month = c(rep(12, 14), rep(1, 21)),
       day = c(seq(18, 31, 1), seq(1, 21, 1)),
       create = sample(c(1, 2, 3, NA), 35, replace = TRUE, prob = c(0.3, 0.3, 0.3, 0.1)),
       verified = sample(c(1, 2, 3, NA), 35, replace = TRUE, prob = c(0.1, 0.1, 0.1, 0.7)),
       trans = sample(c(1, 2, 3, NA), 35, replace = TRUE, prob = c(0.1, 0.2, 0.1, 0.6))) %>%
  # Build a Date from the three columns, then derive the week number
  mutate(date_posix = as.Date(paste0(year, "-", month, "-", day)),
         week = lubridate::week(date_posix)) %>%
  group_by(week) %>%
  summarise(create_sum = sum(create, na.rm = TRUE),
            verified_sum = sum(verified, na.rm = TRUE),
            trans_sum = sum(trans, na.rm = TRUE)) %>%
  # Reshape to long format: one row per week and variable
  gather(variable, value, -week) %>%
  ggplot(., aes(x = factor(week), y = value, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge")
Created on 2018-09-18 by the reprex package (v0.2.0).
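One thing to be aware of when comparing the two versions: they do not number weeks the same way. lubridate documents week() as the number of complete seven-day periods between the date and 1 January, plus one, whereas %W counts weeks starting on Monday. The sketch below reproduces the documented lubridate definition in base R so that it runs without extra packages. The names d, week_jan1 and week_monday are hypothetical, and the formula is my reading of that definition rather than the package's own code.

```r
d <- as.Date(c("2016-01-03", "2016-01-04", "2016-01-08"))

# Complete 7-day periods since 1 January, plus one
# (the documented definition of lubridate::week(); yday is 0-based here)
week_jan1 <- as.POSIXlt(d)$yday %/% 7 + 1    # 1 1 2

# Monday-based week number, as in the strftime(..., "%W") version
week_monday <- as.integer(format(d, "%W"))   # 0 1 1

print(data.frame(d, week_jan1, week_monday))
```

The same day can therefore land in a different bar depending on which version is used, so it is worth picking one convention and stating it on the axis label.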
Upvotes: 4