Reputation: 69
I am very new to R, but am trying to create vertical mean lines in a barplot from the ggplot2 library.
My current graph is grouped by year, with each colored bar representing a different age. The y axis is count: mygraph
I would like to have a vertical mean line for each group by year (i.e., a mean line of age for year 1, a mean line of age for year 2, etc.)
This is what I have written thus far:
library(ggplot2)
AgeDat <- read.csv(File Name)
columnmeans <- c("MeanValue", "MeanValue", "MeanValue"..."MeanValue")
# Grouped bars: one group per year, one colored bar per age
SepAgePlot <- ggplot(AgeDat, aes(factor(Year), Count, fill = factor(Age))) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_vline(xintercept = columnmeans) +
  theme_classic()
SepAgePlot + labs(x = "Year", y = "count", title = "my graph")
Thank you in advance for all your help!
UPDATE: This would be a sample data set:
Year Age Count
1964 31 15
1964 33 23
1964 34 54
1964 35 8
1964 36 44
1964 37 21
1964 38 23
1964 39 26
1964 40 23
1965 30 22
1965 31 23
1965 32 45
1965 33 55
1965 34 23
1965 35 10
1965 36 12
1965 37 16
1965 38 32
1965 39 36
1965 40 13
1966 30 27
1966 31 32
1966 32 19
1966 33 45
1966 34 35
1966 35 60
1966 36 15
1966 37 28
1966 38 56
1966 39 18
1966 40 25
1967 30 36
1967 31 32
1967 32 23
1967 33 9
1967 34 15
1967 35 0
1967 36 5
1967 37 7
1967 38 24
1967 39 31
1967 40 24
The mean age of each year is:
1964 35.4
1965 35.6
1966 35.0
1967 34.6
My goal is to create a barplot of the data grouped by year, with a line representing the mean age for each year.
Upvotes: 1
Views: 1568
Reputation: 657
You can use the group_by and summarise functions from the dplyr package to add a column with the count-weighted mean age for each year:
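library(dplyr)
# Compute the count-weighted mean age per year, then join it back onto every row
AgeDat <- AgeDat %>%
  group_by(Year) %>%
  summarise(MeanValue = sum(Age * Count) / sum(Count)) %>%
  inner_join(AgeDat, ., by = "Year")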
AgeDat
Year Age Count MeanValue
1 1964 30 44 34.64545
2 1964 31 44 34.64545
3 1964 32 46 34.64545
4 1964 33 35 34.64545
5 1964 34 83 34.64545
6 1964 35 70 34.64545
7 1964 36 73 34.64545
8 1964 37 85 34.64545
9 1964 38 31 34.64545
10 1964 39 39 34.64545
11 1965 30 100 34.65485
12 1965 31 19 34.65485
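As a small cross-check of the weighted-mean formula above, here is a minimal sketch that uses only six rows copied from the question's sample data (three each for 1964 and 1965). The names AgeSample, YearMeans, and BaseMeans are made up for this illustration; the comparison against base R's weighted.mean() is just one way to confirm that sum(Age*Count)/sum(Count) is a count-weighted mean.

```r
library(dplyr)

# Six rows copied from the sample data in the question (illustrative subset only)
AgeSample <- data.frame(
  Year  = c(1964, 1964, 1964, 1965, 1965, 1965),
  Age   = c(31, 33, 34, 30, 31, 32),
  Count = c(15, 23, 54, 22, 23, 45)
)

# Weighted mean age per year: each age is weighted by its count
YearMeans <- AgeSample %>%
  group_by(Year) %>%
  summarise(MeanValue = sum(Age * Count) / sum(Count))

# The same quantity via base R's weighted.mean(), one value per year
BaseMeans <- sapply(
  split(AgeSample, AgeSample$Year),
  function(d) weighted.mean(d$Age, d$Count)
)

print(YearMeans)
print(BaseMeans)
```

Both approaches should agree; on the full data set the values will differ from this subset, since only a few rows are used here.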
You can then use geom_vline to add a vertical line for each year, and facet_wrap to get the age distribution by year:
ggplot(AgeDat) +
  geom_bar(aes(x = Age, y = Count), stat = "identity", position = "dodge") +
  geom_vline(aes(xintercept = MeanValue)) +
  facet_wrap(~Year, nrow = 1) +
  theme_classic()
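A possible variation, sketched under a few assumptions: instead of joining MeanValue onto every row (which draws the same line once per row in each panel), you could keep the per-year means in a separate data frame and pass it to geom_vline through its data argument, so each facet gets exactly one line. The names AgeSample, YearMeans, and MeanPlot, and the dashed red styling, are illustrative choices and not part of the answer above; the rows are a small subset of the question's sample data.

```r
library(dplyr)
library(ggplot2)

# Six rows copied from the sample data in the question (illustrative subset only)
AgeSample <- data.frame(
  Year  = c(1964, 1964, 1964, 1965, 1965, 1965),
  Age   = c(31, 33, 34, 30, 31, 32),
  Count = c(15, 23, 54, 22, 23, 45)
)

# One row per year holding the count-weighted mean age
YearMeans <- AgeSample %>%
  group_by(Year) %>%
  summarise(MeanValue = sum(Age * Count) / sum(Count))

# Bars come from the full data; the vertical lines come from the summary table.
# Because YearMeans also has a Year column, facet_wrap puts each line in its own panel.
MeanPlot <- ggplot() +
  geom_col(data = AgeSample, aes(x = Age, y = Count)) +
  geom_vline(data = YearMeans, aes(xintercept = MeanValue),
             linetype = "dashed", colour = "red") +
  facet_wrap(~Year, nrow = 1) +
  theme_classic()

print(MeanPlot)
```

Here geom_col() is used as shorthand for geom_bar(stat = "identity"); either form should give the same bars.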
Upvotes: 2