Ellen Wang

Reputation: 33

How to add a legend to a ggplot with a dual y-axis in R

I'm trying to add a legend to my ggplot figure. I have tried several approaches, but none of them makes the legend appear. Here is the dataset I am using. Below is the code I have written so far.

library(dplyr)
library(tidyr)
library(ggplot2)
library(lubridate)
library(ggthemes)

stock %>%
  drop_na(disease, US_Stock, China_Stock) %>%
  filter(disease == "Ebola") %>%
  ggplot(aes(x = date)) + 
    geom_line(aes(y = US_Stock), size = 0.5, col = "dark green") +
    geom_line(aes(y = China_Stock/5+1500), size = 0.5, col = "red") +
    scale_x_date(name = "Date") +
    scale_y_continuous(name = "US Stock Market (S&P 500)", 
                       sec.axis = sec_axis(~(.-1500)*5, name = "China Stock Market (CSI300)")) +
    labs(title = "Figure 3: Stock Market under Ebola", caption = "Death Rate: 50%-90%")  +
    theme_stata() 

Here are several observations in my data.

Upvotes: 3

Views: 310

Answers (2)

Edward

Reputation: 18598

I prefer to reshape all stocks into one column, as this makes legends and other aesthetics easier: once colour is mapped to a Country column, ggplot draws the legend automatically. I've also used the ratio of the means to rescale the second series, which makes the graph a little more interesting. (But I don't like dual axes.)

library(dplyr)
library(tidyr)
library(ggplot2)
library(lubridate)
library(ggthemes)
ratio = mean(stock$China_Stock)/mean(stock$US_Stock)

stock %>%
  mutate(China_Stock=China_Stock / ratio) %>%
  pivot_longer(cols=ends_with("Stock"), names_pattern="(.+)_(Stock)",
               names_to = c("Country", ".value")) %>%
  ggplot(aes(x = date, y=Stock, col=Country)) + 
  geom_line(size = 0.5) +
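  # Named values keep the question's colours (US green, China red) regardless of level order
  scale_color_manual(values=c(US="darkgreen", China="red")) +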
  labs(title = "Figure 3: Stock Market under Ebola", caption = "Death Rate: 50%-90%") +
  scale_y_continuous(name = "US Stock Market (S&P 500)", 
                       sec.axis = sec_axis(~.*ratio, name = "China Stock Market (CSI300)")) +
  theme_minimal() 



Data:

stock <- structure(list(ID = 131:140, date = structure(c(11878, 11879, 
11880, 11883, 11884, 11885, 11886, 11887, 11890, 11891), class = "Date"), 
    US_Stock = c(920.47, 927.37, 921.39, 917.93, 901.05, 906.04, 
    881.56, 847.76, 819.85, 797.7), China_Stock = c(1392.9, 1390.93, 
    1389.45, 1379.38, 1382.41, 1393.02, 1397.41, 1403.25, 1380.28, 
    1375.61), disease = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L), .Label = "SARS", class = "factor")), class = "data.frame", row.names = c(NA, 
-10L))
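
To make the reshaping step in this answer easier to follow, here is a minimal sketch that runs only the rescale-and-pivot part and skips the plot. `stock_demo` and `long` are hypothetical names used for illustration; `stock_demo` holds just the first three rows of the data above.

```r
library(dplyr)
library(tidyr)

# Hypothetical three-row stand-in for the `stock` data (assumption: same columns)
stock_demo <- data.frame(
  date        = as.Date(c(11878, 11879, 11880), origin = "1970-01-01"),
  US_Stock    = c(920.47, 927.37, 921.39),
  China_Stock = c(1392.9, 1390.93, 1389.45)
)

# Ratio of the means, used to bring the China series onto the US scale
ratio <- mean(stock_demo$China_Stock) / mean(stock_demo$US_Stock)

long <- stock_demo %>%
  mutate(China_Stock = China_Stock / ratio) %>%
  # "US_Stock" / "China_Stock" become a Country column plus one Stock value column
  pivot_longer(cols = ends_with("Stock"),
               names_pattern = "(.+)_(Stock)",
               names_to = c("Country", ".value"))

print(long)

# Multiplying by `ratio` (what sec_axis(~ . * ratio) does) recovers the original values
print(long$Stock[long$Country == "China"] * ratio)
```

The long table has one row per date and country, which is the shape that lets a single `col = Country` mapping produce the legend.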

Upvotes: 2

Ian Campbell

Reputation: 24790

You were pretty close. I typically set the ratio ahead of time.

library(dplyr)
library(ggplot2)
library(ggthemes)
ratio = max(stock$China_Stock)/max(stock$US_Stock)
stock %>% ggplot(aes(x=date)) +
  geom_line(aes(y = US_Stock * ratio), size = 0.5, col = "dark green") +
  geom_line(aes(y = China_Stock), size = 0.5, col = "red") +
  scale_x_date(name = "Date") +
  scale_y_continuous(name = "China Stock Market (CSI300)",
                     sec.axis = sec_axis(~ . / ratio, name = "US Stock Market (S&P 500)")) +
  labs(title = "Figure 3: Stock Market under Ebola", caption = "Death Rate: 50%-90%") +
  theme_stata()


Data

stock <- structure(list(ID = 131:140, date = structure(c(11878, 11879, 
11880, 11883, 11884, 11885, 11886, 11887, 11890, 11891), class = "Date"), 
    US_Stock = c(920.47, 927.37, 921.39, 917.93, 901.05, 906.04, 
    881.56, 847.76, 819.85, 797.7), China_Stock = c(1392.9, 1390.93, 
    1389.45, 1379.38, 1382.41, 1393.02, 1397.41, 1403.25, 1380.28, 
    1375.61), disease = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L), .Label = "SARS", class = "factor")), class = "data.frame", row.names = c(NA, 
-10L))
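
The plotting code in this answer fixes the scaling but sets the colours outside `aes()`, so ggplot2 still has nothing to build a legend from. A minimal sketch of one common way to get the legend while keeping the two-layer structure is to map a constant label to `colour` inside `aes()` and assign the actual colours with `scale_colour_manual()`. The legend title "Market", the label strings, and the legend position are my own choices for illustration, not part of the original answer.

```r
library(ggplot2)

# Same ten rows as the `stock` data above, rebuilt compactly so the block runs on its own
stock <- data.frame(
  date = as.Date(c(11878, 11879, 11880, 11883, 11884,
                   11885, 11886, 11887, 11890, 11891), origin = "1970-01-01"),
  US_Stock = c(920.47, 927.37, 921.39, 917.93, 901.05,
               906.04, 881.56, 847.76, 819.85, 797.7),
  China_Stock = c(1392.9, 1390.93, 1389.45, 1379.38, 1382.41,
                  1393.02, 1397.41, 1403.25, 1380.28, 1375.61)
)

ratio <- max(stock$China_Stock) / max(stock$US_Stock)

p <- ggplot(stock, aes(x = date)) +
  # A constant string mapped to colour inside aes() becomes a legend entry.
  # `linewidth` replaces `size` for lines in ggplot2 >= 3.4.0.
  geom_line(aes(y = US_Stock * ratio, colour = "US (S&P 500)"), linewidth = 0.5) +
  geom_line(aes(y = China_Stock, colour = "China (CSI300)"), linewidth = 0.5) +
  # Named values tie each label to its colour explicitly
  scale_colour_manual(name = "Market",
                      values = c("US (S&P 500)" = "darkgreen",
                                 "China (CSI300)" = "red")) +
  scale_x_date(name = "Date") +
  scale_y_continuous(name = "China Stock Market (CSI300)",
                     sec.axis = sec_axis(~ . / ratio, name = "US Stock Market (S&P 500)")) +
  theme(legend.position = "bottom")

print(p)
```

The same mapping should also work with `theme_stata()` from ggthemes in place of the `theme()` call, if that look is preferred.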

Upvotes: 1
