Reputation: 799
I am trying to create a grouped bar chart of my data with ggplot2, showing the mean of avgct for each region with error bars.
Here is a sample of my data:
gregion lregion avgct
1 e 1.146
1 e 0.947
2 e 0.908
3 e 1.167
1 t 1.225
2 t 1.058
2 t 2.436
3 t 0.679
So far I have managed to create a graph, but it seems to be plotting the maximum values of avgct rather than the mean, so I cannot add error bars.
I think I need to calculate the mean of avgct by gregion and lregion so that I have an average value of avgct for each region, like this:
gregion lregion mean(avgct)
1 e 1.047
2 e 0.908
3 e 1.167
1 t 1.225
2 t 1.747
3 t 0.679
If anyone can help me with this, so that I can plot a bar chart of averages with error bars for my data, it would be very much appreciated!
Upvotes: 2
Views: 1251
Reputation: 193517
This is a basic aggregation question, so the typical starting point should be aggregate:
> aggregate(avgct ~ gregion + lregion, mydf, mean)
  gregion lregion  avgct
1       1       e 1.0465
2       2       e 0.9080
3       3       e 1.1670
4       1       t 1.2250
5       2       t 1.7470
6       3       t 0.6790
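For anyone who wants to reproduce the call above, here is a minimal sketch. It rebuilds the data frame from the eight sample rows in the question; the name mydf comes from the call above, while the result name means is only an illustrative choice.

```r
# Sample data copied from the question; `mydf` matches the name used above
mydf <- data.frame(
  gregion = c(1, 1, 2, 3, 1, 2, 2, 3),
  lregion = c("e", "e", "e", "e", "t", "t", "t", "t"),
  avgct   = c(1.146, 0.947, 0.908, 1.167, 1.225, 1.058, 2.436, 0.679)
)

# Mean of avgct for every gregion/lregion combination
# (`means` is a hypothetical name chosen for this example)
means <- aggregate(avgct ~ gregion + lregion, data = mydf, FUN = mean)
print(means)
```

The formula interface groups by every variable on the right-hand side, so adding another grouping column only means extending the formula.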
There are, however, several other alternatives, including "dplyr" and "data.table", that may be more appealing in the long run for convenience of syntax and overall efficiency.
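library(data.table)
# Name the result column explicitly; otherwise data.table calls it V1
as.data.table(mydf)[, .(avgct = mean(avgct)), by = .(gregion, lregion)]
library(dplyr)
# Same aggregation with dplyr: group, then summarise each group
mydf %>% group_by(gregion, lregion) %>% summarise(avgct = mean(avgct))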
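Since the question ultimately asks for a bar chart with error bars, the dplyr pipeline above could be extended roughly as follows. This is only a sketch under some assumptions: the error bars show mean plus or minus one standard error (the question does not say which measure is wanted), the column names mean_ct and se are made up for the example, and the data frame is rebuilt from the sample rows in the question. Note that sd() returns NA for a group with a single observation, so with this small sample only the groups with two rows get an error bar, and ggplot2 will warn about the dropped rows.

```r
library(dplyr)
library(ggplot2)

# Sample data copied from the question
mydf <- data.frame(
  gregion = c(1, 1, 2, 3, 1, 2, 2, 3),
  lregion = c("e", "e", "e", "e", "t", "t", "t", "t"),
  avgct   = c(1.146, 0.947, 0.908, 1.167, 1.225, 1.058, 2.436, 0.679)
)

# Mean and standard error per group (assumption: SE is the desired error
# measure); groups with a single row get se = NA
summ <- mydf %>%
  group_by(gregion, lregion) %>%
  summarise(mean_ct = mean(avgct),
            se      = sd(avgct) / sqrt(n()),
            .groups = "drop")

# Dodged bars of the group means, with error bars dodged by the same width
p <- ggplot(summ, aes(x = factor(gregion), y = mean_ct, fill = lregion)) +
  geom_col(position = position_dodge(width = 0.9)) +
  geom_errorbar(aes(ymin = mean_ct - se, ymax = mean_ct + se),
                position = position_dodge(width = 0.9), width = 0.2) +
  labs(x = "gregion", y = "mean avgct")
print(p)
```

Summarising first and plotting the summary with geom_col keeps the bar heights equal to the computed means, rather than relying on how the raw rows are stacked or overplotted.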
Upvotes: 1