cmirian

Reputation: 2253

ggplot: how to add legend to a plot composed of several geom_ribbon() and geom_line()?

Question: how can I add a legend to this specific plot?

This is the plot I have so far (image not shown here).

The legend should include:

nd$y_fem - the blue line - should be in the legend as "5-yrs probability of death"

nd$y_tre - the red line - should be in the legend as "3-yrs probability of death"

nd$y_et - the green line - should be in the legend as "1-yr probability of death"

Preferably, the legend should include both the line and the fill.

How can this be done?

ggplot(nd, aes(x=n_fjernet))  +
  geom_ribbon(aes(ymin = y_tre, ymax = y_fem), alpha = .15, fill="#2C77BF") +
  geom_line(aes(y=y_fem), size=3, color="white") +  
  geom_line(aes(y=y_fem), color="#2C77BF", size=.85) + 

  geom_ribbon(aes(ymin = y_et, ymax = y_tre), alpha = .15, fill="#E38072") +     
  geom_line(aes(y=y_tre), size=3, color="white") + 
  geom_line(aes(y=y_tre), color="#E38072", size=.85) +

  geom_ribbon(aes(ymin = 0, ymax = y_et), alpha = .15, fill="#6DBCC3") + 
  geom_line(aes(y=y_et), size=3, color="white") +
  geom_line(aes(y=y_et), color="#6DBCC3",  size=.85) + 

  scale_x_continuous(breaks = seq(0,10,2), limits=c(0,10)) 

My data

nd <- structure(list(y_et = c(0.473, 0.473, 0.472, 0.471, 0.471, 0.47, 
0.47, 0.469, 0.468, 0.468, 0.467, 0.467, 0.466, 0.465, 0.465, 
0.464, 0.464, 0.463, 0.462, 0.462, 0.461, 0.461, 0.46, 0.459, 
0.459, 0.458, 0.458, 0.457, 0.456, 0.456, 0.455, 0.455, 0.454, 
0.453, 0.453, 0.452, 0.452, 0.451, 0.45, 0.45, 0.449, 0.449, 
0.448, 0.447, 0.447, 0.446, 0.446, 0.445, 0.445, 0.444, 0.443, 
0.443, 0.442, 0.442, 0.441, 0.44, 0.44, 0.439, 0.439, 0.438, 
0.438, 0.437, 0.436, 0.436, 0.435, 0.435, 0.434, 0.433, 0.433, 
0.432, 0.432, 0.431, 0.431, 0.43, 0.429, 0.429, 0.428, 0.428, 
0.427, 0.427, 0.426, 0.425, 0.425, 0.424, 0.424, 0.423, 0.423, 
0.422, 0.421, 0.421, 0.42, 0.42, 0.419, 0.419, 0.418, 0.417, 
0.417, 0.416, 0.416, 0.415), y_tre = c(0.895, 0.894, 0.894, 0.893, 
0.893, 0.893, 0.892, 0.892, 0.891, 0.891, 0.89, 0.89, 0.889, 
0.889, 0.889, 0.888, 0.888, 0.887, 0.887, 0.886, 0.886, 0.886, 
0.885, 0.885, 0.884, 0.884, 0.883, 0.883, 0.882, 0.882, 0.881, 
0.881, 0.881, 0.88, 0.88, 0.879, 0.879, 0.878, 0.878, 0.877, 
0.877, 0.876, 0.876, 0.875, 0.875, 0.875, 0.874, 0.874, 0.873, 
0.873, 0.872, 0.872, 0.871, 0.871, 0.87, 0.87, 0.869, 0.869, 
0.868, 0.868, 0.867, 0.867, 0.866, 0.866, 0.865, 0.865, 0.865, 
0.864, 0.864, 0.863, 0.863, 0.862, 0.862, 0.861, 0.861, 0.86, 
0.86, 0.859, 0.859, 0.858, 0.858, 0.857, 0.857, 0.856, 0.856, 
0.855, 0.855, 0.854, 0.854, 0.853, 0.853, 0.852, 0.852, 0.851, 
0.851, 0.85, 0.85, 0.849, 0.848, 0.848), y_fem = c(0.974, 0.974, 
0.973, 0.973, 0.973, 0.973, 0.973, 0.973, 0.972, 0.972, 0.972, 
0.972, 0.972, 0.971, 0.971, 0.971, 0.971, 0.971, 0.971, 0.97, 
0.97, 0.97, 0.97, 0.97, 0.969, 0.969, 0.969, 0.969, 0.969, 0.968, 
0.968, 0.968, 0.968, 0.968, 0.967, 0.967, 0.967, 0.967, 0.967, 
0.966, 0.966, 0.966, 0.966, 0.966, 0.965, 0.965, 0.965, 0.965, 
0.965, 0.964, 0.964, 0.964, 0.964, 0.963, 0.963, 0.963, 0.963, 
0.963, 0.962, 0.962, 0.962, 0.962, 0.961, 0.961, 0.961, 0.961, 
0.961, 0.96, 0.96, 0.96, 0.96, 0.959, 0.959, 0.959, 0.959, 0.958, 
0.958, 0.958, 0.958, 0.957, 0.957, 0.957, 0.957, 0.957, 0.956, 
0.956, 0.956, 0.956, 0.955, 0.955, 0.955, 0.955, 0.954, 0.954, 
0.954, 0.954, 0.953, 0.953, 0.953, 0.952), n_fjernet = c(0, 0.1, 
0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4, 
1.5, 1.6, 1.7, 1.8, 1.9, 2, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 
2.8, 2.9, 3, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4, 
4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5, 5.1, 5.2, 5.3, 
5.4, 5.5, 5.6, 5.7, 5.8, 5.9, 6, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 
6.7, 6.8, 6.9, 7, 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 7.9, 
8, 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 8.9, 9, 9.1, 9.2, 
9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9)), row.names = c(NA, -100L), class = c("data.table", 
"data.frame"))

Upvotes: 1

Views: 283

Answers (1)

Allan Cameron

Reputation: 174556

In ggplot2, a legend is generated from a scale, and a scale is only created for aesthetics that are mapped inside aes(). You have plotted each ribbon and line separately, with the colours and fills set as constants, so there is no scale tying them all together from which a legend can be automatically generated.

I can see why you have done it this way: your three series sit in three separate columns rather than in a single value column with a second column identifying the series. This is one of the occasions when it is best to transform your data into long format for the purposes of plotting, using pivot_longer from tidyr.

To simplify the data wrangling, instead of using ribbons, you can use a stacked area plot. Stacked areas are cumulative, so each series must first be converted to its height above the series below it, which we can do easily with mutate:
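As a minimal sketch of what the reshaping does, here is pivot_longer applied to a two-row stand-in for `nd`. The `wide` data frame is made up for illustration; only the column names come from the question.

```r
library(tidyr)

# Hypothetical two-row stand-in for the question's `nd`
wide <- data.frame(
  n_fjernet = c(0, 1),
  y_et  = c(0.47, 0.46),
  y_tre = c(0.90, 0.89),
  y_fem = c(0.97, 0.97)
)

# Each y_* column becomes rows: the old column name goes into `name`,
# the old cell value into `value` (pivot_longer's default output names)
long <- pivot_longer(wide, cols = c(y_et, y_tre, y_fem))

print(long)
```

With the series identifier in a single `name` column, it can be mapped to colour and fill inside aes(), which is what gives ggplot2 a scale to build the legend from.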

library(dplyr)
library(tidyr)
library(ggplot2)

# Legend labels, in the same order as the factor levels set below
my_labels <- c("5 year probability of death",
               "3 year probability of death",
               "1 year probability of death")

# Stacked areas are cumulative, so store each band as its height above the band below
df <- mutate(nd, y_fem = y_fem - y_tre, y_tre = y_tre - y_et) %>%
  tidyr::pivot_longer(1:3) %>%   # columns 1:3 are y_et, y_tre, y_fem
  mutate(name = factor(name, levels = c("y_fem", "y_tre", "y_et")))

ggplot(df, aes(x = n_fjernet, y = value, colour = name, group = name)) +
  geom_area(aes(fill = name), position = "stack", alpha = 0.15) +
  # White underlay; linewidth needs ggplot2 >= 3.4.0 (use size in older versions)
  geom_line(colour = "white", linewidth = 3, position = "stack") +
  geom_line(position = "stack") +
  # Mark the first and last observation of each series
  geom_point(position = "stack", data = df[c(1:3, -2:0 + nrow(df)), ]) +
  scale_fill_manual(values = c("#2C77BF", "#E38072", "#6DBCC3"),
                    labels = my_labels) +
  scale_colour_manual(values = c("#2C77BF", "#E38072", "#6DBCC3"),
                      labels = my_labels)
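The subtraction in the mutate call is easy to check by hand. The sketch below uses the first and last rows of the supplied data and confirms that stacking the three band heights reproduces the original curves. The `band_*` names are only for illustration.

```r
# First and last rows of the question's data
y_et  <- c(0.473, 0.415)
y_tre <- c(0.895, 0.848)
y_fem <- c(0.974, 0.952)

# Height of each band above the band beneath it
band_1 <- y_et           # bottom band starts at 0
band_3 <- y_tre - y_et   # sits on top of band_1
band_5 <- y_fem - y_tre  # sits on top of band_1 + band_3

# Stacking (cumulative sums) gives back the original values
stopifnot(isTRUE(all.equal(band_1 + band_3, y_tre)))
stopifnot(isTRUE(all.equal(band_1 + band_3 + band_5, y_fem)))
```

Without the subtraction, position = "stack" would add the raw values together and the top line would end up well above 1.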

(Image of the resulting plot, not shown here.)

Note that the plot is different from the example one: the supplied data is only for the leftmost part of the plot.
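If you would rather keep the original ribbons and the wide data, one alternative (not the approach taken above) is to map colour and fill to a constant label inside aes() and supply the colours through manual scales. This is a simplified sketch: the `toy` data frame is a made-up stand-in for `nd`, the white underlay lines are left out, and the helper names `lab5`, `lab3`, `lab1` and `cols` are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for the question's `nd` (illustrative values)
toy <- data.frame(
  n_fjernet = c(0, 5, 10),
  y_et  = c(0.47, 0.44, 0.41),
  y_tre = c(0.90, 0.87, 0.85),
  y_fem = c(0.97, 0.96, 0.95)
)

# Legend labels requested in the question
lab5 <- "5-yrs probability of death"
lab3 <- "3-yrs probability of death"
lab1 <- "1-yr probability of death"

# Named vector so each label is matched to its colour explicitly
cols <- c("#2C77BF", "#E38072", "#6DBCC3")
names(cols) <- c(lab5, lab3, lab1)

# Mapping a constant string inside aes() creates a scale, and so a legend entry
p <- ggplot(toy, aes(x = n_fjernet)) +
  geom_ribbon(aes(ymin = y_tre, ymax = y_fem, fill = lab5), alpha = 0.15) +
  geom_line(aes(y = y_fem, colour = lab5)) +
  geom_ribbon(aes(ymin = y_et, ymax = y_tre, fill = lab3), alpha = 0.15) +
  geom_line(aes(y = y_tre, colour = lab3)) +
  geom_ribbon(aes(ymin = 0, ymax = y_et, fill = lab1), alpha = 0.15) +
  geom_line(aes(y = y_et, colour = lab1)) +
  scale_colour_manual(name = NULL, values = cols, breaks = names(cols)) +
  scale_fill_manual(name = NULL, values = cols, breaks = names(cols))

print(p)
```

Because both scales share the same title and labels, ggplot2 should combine them into a single legend showing the line and the fill together. The long-format version remains tidier when there are more than a handful of series.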

Upvotes: 1
