Reputation: 1314
I have data and a plot like the example below.
I'd like to add a third "Condition" that is the total of Amount across Condition A and Condition B for a given Year and Month. I don't know how to do that because Condition is included in the group_by statement. In particular, I'd like to draw it on the same plot as the one below, so that there is a third line for each Year showing the Total.
library(ggplot2)
library(dplyr)
data <- data.frame(Amount = sample(1:100, replace = TRUE),
                   Condition = sample(c("A", "B"), 100, replace = TRUE),
                   Year = sample(2015:2017, 100, replace = TRUE),
                   Month = sample(1:12, 100, replace = TRUE))
dataGrouped <- data %>%
  group_by(Year, Month, Condition) %>%
  summarize(sumAmount = sum(Amount))
ggplot(dataGrouped, aes(x = Month, y = sumAmount, color = factor(Year), linetype = Condition)) +
  geom_line(size = 1) + scale_x_continuous(breaks = 1:12)
I've considered first doing a group_by(Year, Month) and then adding a Total, but I'm still not sure what the best way to do this is (or whether there's a better alternative).
Upvotes: 3
Views: 7709
Reputation: 599
Here's a dplyr solution that summarizes the total by Year and Month and then binds it to the grouped data with a Condition value of "Total", so that ggplot() will pick it up as a new line in your plot.
library(ggplot2)
library(dplyr)
data <- data.frame(Amount = sample(1:100, replace = TRUE),
                   Condition = sample(c("A", "B"), 100, replace = TRUE),
                   Year = sample(2015:2017, 100, replace = TRUE),
                   Month = sample(1:12, 100, replace = TRUE))
dataGrouped <- data %>%
  group_by(Year, Month, Condition) %>%
  summarize(sumAmount = sum(Amount))
ggplot(dataGrouped, aes(x = Month, y = sumAmount, color = factor(Year), linetype = Condition)) +
  geom_line(size = 1) + scale_x_continuous(breaks = 1:12)
dataWithTotal <- data %>%
  group_by(Year, Month) %>%
  summarize(sumAmount = sum(Amount)) %>%
  mutate(Condition = 'Total') %>%
  ungroup() %>%
  rbind(ungroup(dataGrouped)) %>%
  mutate(Condition = as.factor(Condition))
ggplot(dataWithTotal, aes(x = Month, y = sumAmount, color = factor(Year), linetype = Condition)) +
  geom_line(size = 1) + scale_x_continuous(breaks = 1:12)
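As a variation on the same idea, here is a minimal sketch that uses bind_rows() instead of rbind() and then checks that every Total row really equals the sum of the A and B rows for that Year and Month. The set.seed() call, the names perCondition, totals, withTotal and check, and the .groups argument (which needs dplyr 1.0.0 or later) are assumptions added for illustration and are not part of the answer above.

```r
library(dplyr)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
data <- data.frame(Amount    = sample(1:100, replace = TRUE),
                   Condition = sample(c("A", "B"), 100, replace = TRUE),
                   Year      = sample(2015:2017, 100, replace = TRUE),
                   Month     = sample(1:12, 100, replace = TRUE))

# Per-condition sums, as in the question
perCondition <- data %>%
  group_by(Year, Month, Condition) %>%
  summarize(sumAmount = sum(Amount), .groups = "drop")

# Sums across both conditions, labelled "Total"
totals <- data %>%
  group_by(Year, Month) %>%
  summarize(sumAmount = sum(Amount), .groups = "drop") %>%
  mutate(Condition = "Total")

# bind_rows() matches columns by name, so the column order does not matter
withTotal <- bind_rows(perCondition, totals)

# Sanity check: each Total equals the sum of the non-Total rows in its group
check <- withTotal %>%
  group_by(Year, Month) %>%
  summarize(parts = sum(sumAmount[Condition != "Total"]),
            total = sum(sumAmount[Condition == "Total"]),
            .groups = "drop")
stopifnot(all(check$parts == check$total))
```

withTotal can be passed to the same ggplot() call as dataWithTotal; ggplot() treats the character Condition column as a discrete variable for linetype.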
Upvotes: 4
Reputation: 7153
Using melt and dcast from reshape2 to reshape the data into wide format, where the new condition C can be computed as A + B:
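library(reshape2)
library(dplyr)
library(tidyr)  # needed for gather() further down
# Convert the id columns to factors so melt() treats Amount as the only measure
data <- data %>%
  mutate_at(vars(Condition, Year, Month), as.factor)
# One row per Year/Month, one column per Condition, summing Amount
dat <- melt(data) %>%
  dcast(., Year + Month ~ Condition, sum)
dat <- dat %>%
  mutate(C = A + B) %>%
  mutate(Month = as.numeric(as.character(Month)))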
Convert back to long format with gather from tidyr:
dat <- dat %>%
  gather(Condition, Amount, A:C)
Plot:
ggplot(dat, aes(Month, Amount, color = factor(Year), linetype = Condition)) +
  geom_line() + scale_x_continuous(breaks = 1:12)
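The same wide-then-long round trip can also be sketched with tidyr alone, using pivot_wider() and pivot_longer() in place of melt, dcast and gather. This is only an illustration under stated assumptions: it needs tidyr 1.1.0 or later for the scalar values_fill argument and dplyr 1.0.0 or later for .groups, and the set.seed() call and the names wide and long are not part of the answer above.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
data <- data.frame(Amount    = sample(1:100, replace = TRUE),
                   Condition = sample(c("A", "B"), 100, replace = TRUE),
                   Year      = sample(2015:2017, 100, replace = TRUE),
                   Month     = sample(1:12, 100, replace = TRUE))

# Wide format: one row per Year/Month, one column per Condition.
# values_fill = 0L puts a zero where a Year/Month has no rows for a Condition.
wide <- data %>%
  group_by(Year, Month, Condition) %>%
  summarize(Amount = sum(Amount), .groups = "drop") %>%
  pivot_wider(names_from = Condition, values_from = Amount, values_fill = 0L) %>%
  mutate(C = A + B)

# Back to long format, now with C as a third Condition
long <- wide %>%
  pivot_longer(cols = c(A, B, C), names_to = "Condition", values_to = "Amount")
```

Because Year and Month stay numeric here, long can go straight into the ggplot() call above without the as.numeric(as.character(Month)) conversion.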
Upvotes: 1