Nido

Reputation: 55

How to show sum and mean with two different Y axes in the same ggplot

I need a ggplot with two Y axes: the primary Y axis showing the count (which I will calculate with stat="count") and the secondary Y axis showing the mean, calculated with stat_summary(fun.y="mean").

I calculated the count of clusters with:

library(ggplot2)
ggplot(df, aes(x=cluster, fill=year)) +
  geom_bar(stat="count")

and the mean of divergence within each cluster with:

ggplot(df, aes(x=factor(cluster), y=divergence)) + stat_summary(fun.y="mean", geom="bar")

I have both of these plots separately. I want to create a single ggplot that combines them, with the count on the primary Y axis (geom_bar) and the mean on the secondary Y axis (geom_point and geom_line). Can anyone help me with this plot? Your help will be highly appreciated.

divergence  year    cluster
0.34    2015    A
0.89    2015    A
1.22    2015    A
1.11    2015    B
0.67    2015    B
0.89    2015    B
1.12    2015    B
0.4     2015    B
0.67    2015    B
0.89    2015    B
0.56    2015    B
1.22    2015    B
1.12    2015    B
0.4     2015    B
0.67    2016    A
0.89    2016    A
0.11    2016    B
1.33    2016    C
1.11    2016    C
1       2016    C
0.89    2016    C
1       2016    C
0.45    2016    C
0.23    2016    C
0.89    2016    C
0.8     2017    A
0.6     2017    A
1.11    2017    A
0.34    2017    B
0.78    2017    B
2.1     2017    C
0.89    2017    C
0.89    2017    C
0.34    2017    C
1.55    2017    A
1.11    2017    A
1.11    2017    A
1       2017    A
0.34    2017    A
0.67    2017    A
0.56    2017    B
1       2017    C
0.34    2017    C
0.23    2017    C
1       2017    C
1.33    2017    C
0.78    2017    C
1.11    2017    B
0.78    2017    C
1       2017    C
0.67    2017    C
0.67    2017    A
0.56    2017    A
1       2017    B
0.34    2017    C
0.67    2017    B
0       2017    B
0.67    2017    B
0.67    2017    B
0.34    2017    B
0.45    2017    B

Upvotes: 1

Views: 1483

Answers (1)

Chris

Reputation: 3996

It's not clear to me why you want a secondary axis when the scales on each axis will be identical, but I have included one and you can decide whether it is appropriate.

I'll assume you're familiar with dplyr:

First I've created a new dataset with just the mean values for each cluster:

library(dplyr)
df_means <- df %>% group_by(cluster) %>% summarise(mean = mean(divergence))

Next I plotted the bar plot along with a line and points for the new dataset. The sec.axis argument of scale_y_continuous can be used to create a secondary axis with a transformed scale (in this case 1:1, so both axes show the same values).

ggplot()+
  geom_bar(data=df,aes(x=cluster,fill=year), stat="count") +
  geom_line(data=df_means,aes(x=cluster,y=mean, group=1,color='Average')) +
  geom_point(data=df_means,aes(x=cluster,y=mean,color='Average'))+
  scale_y_continuous("Count", sec.axis = sec_axis(trans = ~ ., name = 'Average')) +
  theme(legend.title=element_blank())


UPDATE:

Thanks for uploading the additional data; I understand now why you wanted the secondary axis. You can apply a transformation to the secondary axis using the trans parameter. In this case I specified that the secondary axis is one-tenth the scale of the primary one.

You then need to apply the reverse transformation to the line and points you want displayed on the secondary axis; in this case, they are multiplied by 10.

ggplot()+
  geom_bar(data=df,aes(x=cluster,fill=year), stat="count") +
  geom_line(data=df_means,aes(x=cluster,y=mean*10, group=1,color='Average')) +
  geom_point(data=df_means,aes(x=cluster,y=mean*10,color='Average'))+
  scale_y_continuous("Count", sec.axis = sec_axis(trans = ~ ./10, name = 'Average')) +
  theme(legend.title=element_blank())
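
If you would rather not hard-code the factor of 10, one option is to derive it from the data. The following is only a minimal sketch under a few assumptions of mine: scale_factor is a name I made up, the small df is just a handful of rows copied from the question so the example runs on its own, and year is converted to a factor so the bars are filled per year. It uses base R's aggregate instead of dplyr, and passes the transformation as the first positional argument of sec_axis.

```r
library(ggplot2)

# A few rows copied from the question's data (illustration only, not the full set).
df <- data.frame(
  divergence = c(0.34, 0.89, 1.22, 1.11, 0.67, 0.67, 0.89, 0.11, 1.33, 1.11),
  year       = factor(c(2015, 2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016, 2016)),
  cluster    = c("A", "A", "A", "B", "B", "A", "A", "B", "C", "C")
)

# Mean divergence per cluster (base R equivalent of the dplyr summary).
df_means <- aggregate(divergence ~ cluster, data = df, FUN = mean)
names(df_means)[2] <- "mean"

# Hypothetical helper value: stretch the means so the largest mean
# reaches the height of the tallest (stacked) bar.
scale_factor <- max(table(df$cluster)) / max(df_means$mean)

p <- ggplot() +
  geom_bar(data = df, aes(x = cluster, fill = year), stat = "count") +
  # Means are multiplied by scale_factor so they are drawn on the count scale.
  geom_line(data = df_means,
            aes(x = cluster, y = mean * scale_factor, group = 1, color = "Average")) +
  geom_point(data = df_means,
             aes(x = cluster, y = mean * scale_factor, color = "Average")) +
  # The secondary axis undoes that multiplication, so its labels read as means.
  scale_y_continuous("Count",
                     sec.axis = sec_axis(~ . / scale_factor, name = "Average")) +
  theme(legend.title = element_blank())

print(p)
```

The idea is the same as above: whatever you multiply the means by when plotting, you divide by in sec_axis. Whether the tallest bar is the right reference for the scaling is a judgment call and depends on how you want the line to sit relative to the bars.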


Upvotes: 2
