Reputation: 570
I have a CSV file of weights recorded every day for six months (August 2016 - January 2017). I would like to draw a boxplot for each month that shows essentially what summary() reports for that month. I would like to use ggplot2, since its plots look much nicer. I've searched for a solution and found several, but none does quite what I want.
The head and summary of the data:
> wts <- read.csv('weights.csv', header=T, sep=',')
> head(wts)
August.2016 September.2016 October.2016 November.2016 December.2016 January.2016
1 254.2 250.0 248.2 245.8 245.6 244.4
2 252.6 249.2 248.6 246.4 246.0 245.0
3 251.8 250.6 249.2 248.0 246.4 244.3
4 253.2 252.4 249.8 247.5 246.0 243.6
5 252.2 250.6 248.8 247.0 246.0 242.6
6 254.0 251.0 247.8 247.6 246.0 242.0
> summary(wts)
August.2016 September.2016 October.2016 November.2016 December.2016 January.2016
Min. :249.6 Min. :245.6 Min. :245.4 Min. :244.2 Min. :243.4 Min. :241.6
1st Qu.:252.2 1st Qu.:248.3 1st Qu.:246.7 1st Qu.:246.2 1st Qu.:244.8 1st Qu.:242.9
Median :252.8 Median :249.2 Median :247.8 Median :246.6 Median :245.6 Median :243.6
Mean :252.7 Mean :249.1 Mean :247.6 Mean :246.7 Mean :245.3 Mean :243.5
3rd Qu.:253.6 3rd Qu.:250.0 3rd Qu.:248.2 3rd Qu.:247.2 3rd Qu.:246.0 3rd Qu.:244.3
Max. :255.2 Max. :252.4 Max. :249.8 Max. :248.6 Max. :247.0 Max. :245.0
NA's :1 NA's :1 NA's :1
From what I've gathered, I need to reshape the data into a form that ggplot likes, but I'm not sure how to do it. I would also like to highlight the mean (with the actual number) on the boxplot, if possible. Could I get an idea of how to do it?
Thanks
Upvotes: 0
Views: 1442
Reputation: 24198
To stay in the same paradigm, you can use gather() from the tidyr package to reshape your data into a long format, and plug the result into ggplot(). To add text depicting the mean, you can use stat_summary() with the "text" geom and the mean function applied to the value variable.
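library(tidyr)
library(ggplot2)
# gather() stacks the month columns into key/value pairs;
# factor_key = TRUE keeps the months in their original column order
ggplot(gather(wts, factor_key = TRUE),
       aes(key, value)) +
  geom_boxplot() +
  # print each month's mean, rounded to 2 decimals, at the height of the mean
  stat_summary(aes(label = after_stat(y)),  # written ..y.. before ggplot2 3.3.0
               fun = function(x) round(mean(x), 2),  # named fun.y before ggplot2 3.3.0
               geom = "text",
               size = 3,
               color = "red")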
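Since gather() is now marked as superseded in tidyr, the same reshape can also be sketched with pivot_longer(). The snippet below is only an illustration under a few assumptions: the data frame is rebuilt by hand from the first three columns of the head(wts) output in the question rather than read from weights.csv, and the column names month and weight are my own choice, not something from the original data.

```r
library(tidyr)
library(ggplot2)

# Stand-in for weights.csv: first three months of head(wts) from the question
wts <- data.frame(
  August.2016    = c(254.2, 252.6, 251.8, 253.2, 252.2, 254.0),
  September.2016 = c(250.0, 249.2, 250.6, 252.4, 250.6, 251.0),
  October.2016   = c(248.2, 248.6, 249.2, 249.8, 248.8, 247.8)
)

# Wide -> long: one row per (month, weight) pair, dropping missing readings.
# "month" and "weight" are illustrative names chosen for this sketch.
long <- pivot_longer(wts, cols = everything(),
                     names_to = "month", values_to = "weight",
                     values_drop_na = TRUE)

# Keep the months in column order instead of alphabetical order
long$month <- factor(long$month, levels = names(wts))

# One box per month, with the rounded mean printed at its own height
p <- ggplot(long, aes(month, weight)) +
  geom_boxplot() +
  stat_summary(aes(label = after_stat(y)),
               fun = function(x) round(mean(x), 2),
               geom = "text", size = 3, colour = "red")
print(p)
```

With the real file, the data.frame() call would be replaced by the read.csv() line from the question; the rest should carry over unchanged as long as every column holds one month of weights.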
Upvotes: 2