Reputation: 55
I need a ggplot with two Y axes: the primary Y axis showing the count (calculated with stat="count") and the secondary Y axis showing the mean (calculated with stat_summary(fun.y="mean")).
I calculated the count of clusters with:
library(ggplot2)
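# stat is an argument of geom_bar(), not an aesthetic; factor() makes year a discrete fill
ggplot(df, aes(x = cluster, fill = factor(year))) +
  geom_bar(stat = "count")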
and the mean divergence within each cluster with:
# `fun` replaces `fun.y`, which is deprecated since ggplot2 3.3.0
ggplot(df, aes(x = factor(cluster), y = divergence)) +
  stat_summary(fun = "mean", geom = "bar")
I have both of these plots separately. I want a single ggplot that combines them, with the count on the primary Y axis (geom_bar) and the mean on the secondary Y axis (geom_point and geom_line). Can anyone help me with the plotting? Your help will be highly appreciated.
divergence year cluster
0.34 2015 A
0.89 2015 A
1.22 2015 A
1.11 2015 B
0.67 2015 B
0.89 2015 B
1.12 2015 B
0.4 2015 B
0.67 2015 B
0.89 2015 B
0.56 2015 B
1.22 2015 B
1.12 2015 B
0.4 2015 B
0.67 2016 A
0.89 2016 A
0.11 2016 B
1.33 2016 C
1.11 2016 C
1 2016 C
0.89 2016 C
1 2016 C
0.45 2016 C
0.23 2016 C
0.89 2016 C
0.8 2017 A
0.6 2017 A
1.11 2017 A
0.34 2017 B
0.78 2017 B
2.1 2017 C
0.89 2017 C
0.89 2017 C
0.34 2017 C
1.55 2017 A
1.11 2017 A
1.11 2017 A
1 2017 A
0.34 2017 A
0.67 2017 A
0.56 2017 B
1 2017 C
0.34 2017 C
0.23 2017 C
1 2017 C
1.33 2017 C
0.78 2017 C
1.11 2017 B
0.78 2017 C
1 2017 C
0.67 2017 C
0.67 2017 A
0.56 2017 A
1 2017 B
0.34 2017 C
0.67 2017 B
0 2017 B
0.67 2017 B
0.67 2017 B
0.34 2017 B
0.45 2017 B
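For anyone wanting to reproduce this, here is a minimal sketch of loading the table above into the data frame df that the code refers to. It assumes the data is pasted as whitespace-separated text, and only a few of the rows are shown. Converting year to a factor is an assumption on my part, so that fill=year is treated as discrete.

```r
# Sketch: read a whitespace-separated excerpt of the table into df.
df <- read.table(text = "divergence year cluster
0.34 2015 A
0.89 2015 A
1.22 2015 A
1.11 2015 B
0.67 2016 A
1.33 2016 C", header = TRUE, stringsAsFactors = FALSE)

# Assumption: treat year as a discrete variable so it maps to separate fill colours.
df$year <- factor(df$year)
str(df)
```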
Upvotes: 1
Views: 1483
Reputation: 3996
It's not clear to me why you want a secondary axis when the scales on both axes will be identical, but I have included one and you can decide whether it is appropriate.
I'll assume you're familiar with dplyr.
First I've created a new dataset with just the mean values for each cluster:
library(dplyr)  # provides %>%, group_by() and summarise()
df_means <- df %>% group_by(cluster) %>% summarise(mean = mean(divergence))
Next I plotted the bar plot along with a line and points for the new dataset. The sec.axis argument of scale_y_continuous can be used to create a secondary axis with a transformed scale (in this case 1:1):
ggplot() +
  geom_bar(data = df, aes(x = cluster, fill = year), stat = "count") +
  geom_line(data = df_means, aes(x = cluster, y = mean, group = 1, color = 'Average')) +
  geom_point(data = df_means, aes(x = cluster, y = mean, color = 'Average')) +
  scale_y_continuous("Count", sec.axis = sec_axis(trans = ~ ., name = 'Average')) +
  theme(legend.title = element_blank())
UPDATE:
Thanks for uploading the additional data; I understand now why you wanted the secondary axis. You can apply a transformation to the secondary axis using the trans parameter of sec_axis (renamed transform in ggplot2 3.5.0). In this case I specified that the secondary axis is one-tenth the scale of the primary one.
You then need to apply the reverse transformation to the line and points you want displayed against the secondary axis, in this case multiplying by 10, because everything is actually drawn in primary-axis units.
ggplot() +
  geom_bar(data = df, aes(x = cluster, fill = year), stat = "count") +
  geom_line(data = df_means, aes(x = cluster, y = mean * 10, group = 1, color = 'Average')) +
  geom_point(data = df_means, aes(x = cluster, y = mean * 10, color = 'Average')) +
  scale_y_continuous("Count", sec.axis = sec_axis(trans = ~ . / 10, name = 'Average')) +
  theme(legend.title = element_blank())
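The factor of 10 was picked by eye. As a rough sketch, it could instead be derived from the data so that the highest mean lines up with the tallest bar. The names scale_factor and counts are made up for illustration, the toy df only stands in for the real data, and the mean is computed with base R's aggregate() rather than dplyr to keep the example short.

```r
library(ggplot2)

# Toy stand-in for the real data (hypothetical values, not the asker's table).
df <- data.frame(
  divergence = c(0.3, 0.9, 1.2, 1.1, 0.7, 0.9, 1.3, 1.0),
  year       = factor(c(2015, 2015, 2016, 2016, 2016, 2017, 2017, 2017)),
  cluster    = c("A", "A", "A", "B", "B", "B", "B", "C")
)

# Mean divergence per cluster.
df_means <- aggregate(divergence ~ cluster, data = df, FUN = mean)
names(df_means)[2] <- "mean"

# Tallest bar divided by highest mean gives the multiplier for the line.
counts <- table(df$cluster)
scale_factor <- max(counts) / max(df_means$mean)

p <- ggplot() +
  geom_bar(data = df, aes(x = cluster, fill = year), stat = "count") +
  geom_line(data = df_means,
            aes(x = cluster, y = mean * scale_factor, group = 1, color = "Average")) +
  geom_point(data = df_means,
             aes(x = cluster, y = mean * scale_factor, color = "Average")) +
  # The secondary axis divides by the same factor, so it reads in units of the mean.
  scale_y_continuous("Count",
                     sec.axis = sec_axis(~ . / scale_factor, name = "Average")) +
  theme(legend.title = element_blank())

print(p)
```

The key point is that the multiplier used in aes() and the divisor used in sec_axis() must be the same value, otherwise the secondary axis labels no longer match the plotted line.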
Upvotes: 2