Reputation: 191
I have some water quality (metals) results that are taken in June and December of each year. My current df has Month, Year, Detection. I would like to group by each test, i.e. June 2019, December 2019 and June 2020. I could create a new factor, say Test, with values of 0619, 1219, 0620. I could also create a new factor from (Month Year) for each value.
Before that, I was wondering if geom_boxplot could combine the Month and Year factors to plot the 3 unique tests. Grouping by Year or Month alone will not give me the 3 unique tests.
I am looking for a call syntax solution before the new factor route.
ggplot(data = Agm, aes(x = Month+Year, y = Level) , na.rm=TRUE) +
ggtitle("Lead Levels",subtitle=subtext )+
xlab("Test") + ylab("ppb") +
geom_boxplot( fill="red",width = 0.8) + theme_bw()
Upvotes: 0
Views: 771
Reputation: 797
If I understand correctly, you want to display a boxplot using two columns of factors (Month and Year).
There are a couple of ways you can accomplish this. Firstly, you can simply paste your columns together within the ggplot call, for example:
ggplot(data = Agm, aes(x = paste(Year, Month), y = Level)) +
geom_boxplot() + theme_bw()
In this situation though I usually create a new column and use that as the variable for the X axis. This will allow you more flexibility in managing the values and how they display. For example:
library(tidyverse)
# Create a new Date column, combining year and month, separated by a -
Agm <- Agm %>% mutate(Date = paste(Year, Month, sep = "-")) %>% arrange(Date)
ggplot(data = Agm, aes(x = Date, y = Level)) +
geom_boxplot() + theme_bw()
Note that with either method above I suggest pasting the year first and then the month, as I have done, so that the groups are not ordered incorrectly on your plot. The x axis sorts these labels as text, so if the month comes first, January for every year is displayed first (leftmost), followed by February or October, depending on whether the months have leading zeros.
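If the text sorting still puts the groups in the wrong order (for example, months stored without leading zeros, or as names, where "December" sorts before "June"), one option is to make the combined column a factor with explicit levels. The sketch below is only an illustration: the small Agm data frame is made up, Test is a hypothetical column name, and it assumes Year and Month are stored as numbers.

```r
library(ggplot2)

# Made-up stand-in for Agm: three sampling rounds, months stored as numbers
Agm <- data.frame(
  Year  = rep(c(2019, 2019, 2020), each = 4),
  Month = rep(c(6, 12, 6), each = 4),
  Level = c(3.1, 2.8, 4.0, 3.5, 2.2, 2.9, 3.3, 2.7, 4.1, 3.8, 3.6, 4.4)
)

# Zero-padded "YYYY-MM" labels, so text order matches chronological order
label <- sprintf("%d-%02d", as.integer(Agm$Year), as.integer(Agm$Month))

# Fix the level order explicitly by sorting on Year, then Month
# (Test is a hypothetical column name, not something from the question)
Agm$Test <- factor(label, levels = unique(label[order(Agm$Year, Agm$Month)]))

p <- ggplot(data = Agm, aes(x = Test, y = Level)) +
  geom_boxplot() + theme_bw()
print(p)
```

Because the order comes from the factor levels rather than from the label text, you can then relabel the axis (say, "Jun 2019") without changing where each box sits.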
Upvotes: 1