Giuseppe Petri

Reputation: 640

How to add a secondary y-axis for a variable in another scale in ggplot2 R?

I need to plot, in the same figure, two variables that share the same x-scale (though not exactly the same x values) but have different y-scales.

This is a mock data set.

data.to.plot <- read.csv(text = "
day,y_var,variable
27,0.507443942,A
41,2.527504878,A
54,6.751827205,A
68,7.043454632,A
85,5.768129102,A
86,5.402048401,A
97,5.898675564,A
99,5.748121277,A
114,4.720510161,A
127,0.262912624,A
27,0.016378515,B
36,0.186698659,B
41,0.408702584,B
49,0.612277527,B
54,0.621408327,B
56,0.666804636,B
69,0.820265225,B
77,0.773412558,B
84,0.859296621,B
91,0.735260116,B
98,0.722547242,B
105,0.902835074,B
114,0.637068452,B
127,0.187549491,B
")

And this is the code I have for the plot I am able to make.

library(ggplot2)

# Aesthetics set in ggplot() are inherited by both layers
ggplot(data.to.plot, aes(x = day, y = y_var, col = variable)) +
        geom_point() +
        geom_smooth(method = "loess", se = FALSE)


What I need is a secondary y-axis for variable B, so that its scale is expanded and the data are easier to see.
The secondary y-axis should go from 0 to 1.
I played around with the "sec.axis" argument of scale_y_continuous but was not able to find a solution.
Any hint would be really appreciated.

Upvotes: 1

Views: 1512

Answers (1)

NotThatKindODr

Reputation: 719

You need to split the variable into two. You can do this either by creating two separate data frames or by using the spread function from tidyr. For simplicity I did the first. The split is needed because the series you want on the second axis has to be rescaled onto the range of the primary axis, and the secondary axis then has to undo that rescaling so that its labels show the original values. Note the *7.5 in the aes and the matching ~./7.5 in sec_axis.

library(tidyverse)

# One data frame per variable
data.to.plot.A <- data.to.plot %>%
  filter(variable == "A")

data.to.plot.B <- data.to.plot %>%
  filter(variable == "B")

ggplot() +
  geom_point(data = data.to.plot.A, mapping = aes(x = day, y = y_var), color = "red") +
  geom_smooth(data = data.to.plot.A, mapping = aes(x = day, y = y_var), color = "red",
              method = loess, se = FALSE) +
  # B is multiplied by 7.5 so that it spans roughly the same range as A
  geom_point(data = data.to.plot.B, mapping = aes(x = day, y = y_var * 7.5), color = "blue") +
  geom_smooth(data = data.to.plot.B, mapping = aes(x = day, y = y_var * 7.5), color = "blue",
              method = loess, se = FALSE) +
  # The secondary axis divides by 7.5 so its labels show B in original units
  scale_y_continuous(sec.axis = sec_axis(~ . / 7.5, name = "y_var_b"))
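The 7.5 above is hard-coded for this data set. As a rough sketch, the factor could instead be derived from the data, for example as the ratio of the two maxima. The names `a`, `b` and `scale.factor` are my own and not part of the original answer, and the ratio-of-maxima rule is only one possible choice; smoothing layers are left out to keep it short.

```r
library(ggplot2)

# Mock data copied from the question.
data.to.plot <- read.csv(text = "
day,y_var,variable
27,0.507443942,A
41,2.527504878,A
54,6.751827205,A
68,7.043454632,A
85,5.768129102,A
86,5.402048401,A
97,5.898675564,A
99,5.748121277,A
114,4.720510161,A
127,0.262912624,A
27,0.016378515,B
36,0.186698659,B
41,0.408702584,B
49,0.612277527,B
54,0.621408327,B
56,0.666804636,B
69,0.820265225,B
77,0.773412558,B
84,0.859296621,B
91,0.735260116,B
98,0.722547242,B
105,0.902835074,B
114,0.637068452,B
127,0.187549491,B
")

a <- subset(data.to.plot, variable == "A")
b <- subset(data.to.plot, variable == "B")

# Assumption: use the ratio of the maxima, so that the largest B value
# is drawn at the same height as the largest A value.
scale.factor <- max(a$y_var) / max(b$y_var)

p <- ggplot() +
  geom_point(data = a, mapping = aes(x = day, y = y_var), color = "red") +
  # B is stretched onto A's range for drawing ...
  geom_point(data = b, mapping = aes(x = day, y = y_var * scale.factor),
             color = "blue") +
  # ... and the secondary axis applies the inverse, so it is labelled
  # in B's original units.
  scale_y_continuous(
    name = "A",
    sec.axis = sec_axis(~ . / scale.factor, name = "B")
  )
p
```

Whatever factor is used, the multiplication in the aes and the division in sec_axis have to match, otherwise the secondary axis labels no longer correspond to the plotted points.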

Here is the spread solution (I always forget they're phasing out spread in favor of pivot_wider). Same idea though. What you are doing here is taking the variable column and turning each of its values into its own column. This way you don't need to make two objects, one for each variable. This creates NAs, since you don't have data for all days on both variables. Note that the y aes changes to the new column names.

library(tidyverse)
data.to.plot.pivot <- data.to.plot %>%
  pivot_wider(names_from = variable, values_from = y_var)

ggplot(data.to.plot.pivot) +
  geom_point(mapping = aes(x = day, y = A), color = "red") +
  geom_smooth(mapping = aes(x = day, y = A), color = "red",
              method = loess, se = FALSE) +
  geom_point(mapping = aes(x = day, y = B * 7.5), color = "blue") +
  geom_smooth(mapping = aes(x = day, y = B * 7.5), color = "blue",
              method = loess, se = FALSE) +
  scale_y_continuous(sec.axis = sec_axis(~ . / 7.5, name = "y_var_b"))
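A possible third variant, not from the answer above, is to keep the data in long format and rescale only the B rows. This keeps the colour mapping from the question, so ggplot2 draws a legend. It is a sketch under the same 7.5 assumption; `y_plot` and `scale.factor` are hypothetical names introduced here.

```r
library(ggplot2)

# Mock data copied from the question.
data.to.plot <- read.csv(text = "
day,y_var,variable
27,0.507443942,A
41,2.527504878,A
54,6.751827205,A
68,7.043454632,A
85,5.768129102,A
86,5.402048401,A
97,5.898675564,A
99,5.748121277,A
114,4.720510161,A
127,0.262912624,A
27,0.016378515,B
36,0.186698659,B
41,0.408702584,B
49,0.612277527,B
54,0.621408327,B
56,0.666804636,B
69,0.820265225,B
77,0.773412558,B
84,0.859296621,B
91,0.735260116,B
98,0.722547242,B
105,0.902835074,B
114,0.637068452,B
127,0.187549491,B
")

scale.factor <- 7.5  # same factor as in the answer above

# Rescale only the B rows; A rows are left untouched.
data.to.plot$y_plot <- ifelse(data.to.plot$variable == "B",
                              data.to.plot$y_var * scale.factor,
                              data.to.plot$y_var)

p <- ggplot(data.to.plot, aes(x = day, y = y_plot, colour = variable)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) +
  # The secondary axis undoes the rescaling applied to the B rows.
  scale_y_continuous(
    name = "A",
    sec.axis = sec_axis(~ . / scale.factor, name = "B")
  )
p
```

The trade-off is that the rescaling lives in the data frame rather than in the aes, so the extra column has to be recomputed whenever the factor changes.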

Upvotes: 2
