Reputation: 2253
Question: how can I add a legend to this specific plot?
I have the plot produced by the code below.
The legend should include:
- nd$y_fem (the blue line) should be in the legend as "5-yrs probability of death"
- nd$y_tre (the red line) should be in the legend as "3-yrs probability of death"
- nd$y_et (the green line) should be in the legend as "1-yr probability of death"
Preferably, the legend should include both the line and the fill.
How can this be done?
ggplot(nd, aes(x=n_fjernet)) +
geom_ribbon(aes(ymin = y_tre, ymax = y_fem), alpha = .15, fill="#2C77BF") +
geom_line(aes(y=y_fem), size=3, color="white") +
geom_line(aes(y=y_fem), color="#2C77BF", size=.85) +
geom_ribbon(aes(ymin = y_et, ymax = y_tre), alpha = .15, fill="#E38072") +
geom_line(aes(y=y_tre), size=3, color="white") +
geom_line(aes(y=y_tre), color="#E38072", size=.85) +
geom_ribbon(aes(ymin = 0, ymax = y_et), alpha = .15, fill="#6DBCC3") +
geom_line(aes(y=y_et), size=3, color="white") +
geom_line(aes(y=y_et), color="#6DBCC3", size=.85) +
scale_x_continuous(breaks = seq(0,10,2), limits=c(0,10))
My data
nd <- structure(list(y_et = c(0.473, 0.473, 0.472, 0.471, 0.471, 0.47,
0.47, 0.469, 0.468, 0.468, 0.467, 0.467, 0.466, 0.465, 0.465,
0.464, 0.464, 0.463, 0.462, 0.462, 0.461, 0.461, 0.46, 0.459,
0.459, 0.458, 0.458, 0.457, 0.456, 0.456, 0.455, 0.455, 0.454,
0.453, 0.453, 0.452, 0.452, 0.451, 0.45, 0.45, 0.449, 0.449,
0.448, 0.447, 0.447, 0.446, 0.446, 0.445, 0.445, 0.444, 0.443,
0.443, 0.442, 0.442, 0.441, 0.44, 0.44, 0.439, 0.439, 0.438,
0.438, 0.437, 0.436, 0.436, 0.435, 0.435, 0.434, 0.433, 0.433,
0.432, 0.432, 0.431, 0.431, 0.43, 0.429, 0.429, 0.428, 0.428,
0.427, 0.427, 0.426, 0.425, 0.425, 0.424, 0.424, 0.423, 0.423,
0.422, 0.421, 0.421, 0.42, 0.42, 0.419, 0.419, 0.418, 0.417,
0.417, 0.416, 0.416, 0.415), y_tre = c(0.895, 0.894, 0.894, 0.893,
0.893, 0.893, 0.892, 0.892, 0.891, 0.891, 0.89, 0.89, 0.889,
0.889, 0.889, 0.888, 0.888, 0.887, 0.887, 0.886, 0.886, 0.886,
0.885, 0.885, 0.884, 0.884, 0.883, 0.883, 0.882, 0.882, 0.881,
0.881, 0.881, 0.88, 0.88, 0.879, 0.879, 0.878, 0.878, 0.877,
0.877, 0.876, 0.876, 0.875, 0.875, 0.875, 0.874, 0.874, 0.873,
0.873, 0.872, 0.872, 0.871, 0.871, 0.87, 0.87, 0.869, 0.869,
0.868, 0.868, 0.867, 0.867, 0.866, 0.866, 0.865, 0.865, 0.865,
0.864, 0.864, 0.863, 0.863, 0.862, 0.862, 0.861, 0.861, 0.86,
0.86, 0.859, 0.859, 0.858, 0.858, 0.857, 0.857, 0.856, 0.856,
0.855, 0.855, 0.854, 0.854, 0.853, 0.853, 0.852, 0.852, 0.851,
0.851, 0.85, 0.85, 0.849, 0.848, 0.848), y_fem = c(0.974, 0.974,
0.973, 0.973, 0.973, 0.973, 0.973, 0.973, 0.972, 0.972, 0.972,
0.972, 0.972, 0.971, 0.971, 0.971, 0.971, 0.971, 0.971, 0.97,
0.97, 0.97, 0.97, 0.97, 0.969, 0.969, 0.969, 0.969, 0.969, 0.968,
0.968, 0.968, 0.968, 0.968, 0.967, 0.967, 0.967, 0.967, 0.967,
0.966, 0.966, 0.966, 0.966, 0.966, 0.965, 0.965, 0.965, 0.965,
0.965, 0.964, 0.964, 0.964, 0.964, 0.963, 0.963, 0.963, 0.963,
0.963, 0.962, 0.962, 0.962, 0.962, 0.961, 0.961, 0.961, 0.961,
0.961, 0.96, 0.96, 0.96, 0.96, 0.959, 0.959, 0.959, 0.959, 0.958,
0.958, 0.958, 0.958, 0.957, 0.957, 0.957, 0.957, 0.957, 0.956,
0.956, 0.956, 0.956, 0.955, 0.955, 0.955, 0.955, 0.954, 0.954,
0.954, 0.954, 0.953, 0.953, 0.953, 0.952), n_fjernet = c(0, 0.1,
0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4,
1.5, 1.6, 1.7, 1.8, 1.9, 2, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7,
2.8, 2.9, 3, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4,
4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5, 5.1, 5.2, 5.3,
5.4, 5.5, 5.6, 5.7, 5.8, 5.9, 6, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6,
6.7, 6.8, 6.9, 7, 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 7.9,
8, 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 8.9, 9, 9.1, 9.2,
9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9)), row.names = c(NA, -100L), class = c("data.table",
"data.frame"))
Upvotes: 1
Views: 283
Reputation: 174556
In ggplot, the legend is generated from a scale that you set to identify groupings or values within your data. You have plotted each ribbon and line separately, so there is no scale tying them all together from which a legend can be automatically generated.
I can see why you have done it this way: each series is stored in its own column, rather than all the values sitting in a single column with a second column identifying the series. This is one of the occasions when it is best to transform your data into long format for the purposes of plotting, using pivot_longer from tidyr.
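As a minimal sketch of what "long format" means here, assuming a small made-up data frame called wide (hypothetical, not the question's data), pivot_longer turns one column per series into one row per observation, with the series name in a column that a colour or fill scale can use:

```r
library(tidyr)

# Hypothetical toy data in wide format: one column per series
wide <- data.frame(x     = c(0, 1),
                   y_et  = c(0.47, 0.46),
                   y_tre = c(0.89, 0.88))

# Long format: one row per (x, series) pair; the former column names
# end up in `name` and the numbers in `value`
long <- pivot_longer(wide, cols = c(y_et, y_tre))
print(long)
```

A column like name is what ggplot can map to colour and fill, which is what makes an automatic legend possible.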
To simplify the data wrangling, instead of using ribbons, you can use a stacked area plot. This requires you to modify the input data, which we can do easily with mutate:
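library(dplyr)
library(tidyr)
library(ggplot2)

# Legend labels, in the same order as the factor levels set below
my_labels <- c("5 year probability of death",
               "3 year probability of death",
               "1 year probability of death")

# Turn each series into its increment over the series beneath it, so that
# stacking reproduces the original curves, then reshape to long format
# (columns 1:3 are y_et, y_tre and y_fem)
df <- mutate(nd, y_fem = y_fem - y_tre, y_tre = y_tre - y_et) %>%
  tidyr::pivot_longer(1:3) %>%
  mutate(name = factor(name, levels = c("y_fem", "y_tre", "y_et")))

ggplot(df, aes(x = n_fjernet, y = value, colour = name, group = name)) +
  geom_area(aes(fill = name), position = "stack", alpha = 0.15) +
  # White underlay so each coloured line stands out from the fill
  # (with ggplot2 >= 3.4.0, linewidth = 3 replaces size = 3 for lines)
  geom_line(colour = "white", size = 3, position = "stack") +
  geom_line(position = "stack") +
  # Points at the first and last x value of each series
  geom_point(position = "stack", data = df[c(1:3, -2:0 + nrow(df)), ]) +
  scale_fill_manual(values = c("#2C77BF", "#E38072", "#6DBCC3"),
                    labels = my_labels) +
  scale_colour_manual(values = c("#2C77BF", "#E38072", "#6DBCC3"),
                      labels = my_labels)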
Note that the plot is different from the example one: the supplied data is only for the leftmost part of the plot.
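If reshaping is not wanted, the point about legends coming from scales suggests another route: keep the wide data and the separate ribbons, but map a constant label to fill and colour inside aes() so that manual scales have something to build a legend from. The following is only a sketch of that idea, not the method used above. The data frame toy is a hypothetical stand-in for nd with the same column names, the legend title "Outcome" is made up, and the white underlay lines are left out for brevity.

```r
library(ggplot2)

# Hypothetical stand-in for the question's `nd` (same column names)
toy <- data.frame(n_fjernet = c(0, 5, 10),
                  y_et  = c(0.47, 0.44, 0.41),
                  y_tre = c(0.89, 0.87, 0.85),
                  y_fem = c(0.97, 0.96, 0.95))

# Named vector: the names are the legend labels, the values the colours
cols <- c("5-yrs probability of death" = "#2C77BF",
          "3-yrs probability of death" = "#E38072",
          "1-yr probability of death"  = "#6DBCC3")

# Mapping a constant string inside aes() attaches each layer to the scale
p <- ggplot(toy, aes(x = n_fjernet)) +
  geom_ribbon(aes(ymin = y_tre, ymax = y_fem,
                  fill = "5-yrs probability of death"), alpha = 0.15) +
  geom_line(aes(y = y_fem, colour = "5-yrs probability of death")) +
  geom_ribbon(aes(ymin = y_et, ymax = y_tre,
                  fill = "3-yrs probability of death"), alpha = 0.15) +
  geom_line(aes(y = y_tre, colour = "3-yrs probability of death")) +
  geom_ribbon(aes(ymin = 0, ymax = y_et,
                  fill = "1-yr probability of death"), alpha = 0.15) +
  geom_line(aes(y = y_et, colour = "1-yr probability of death")) +
  # Same title and breaks on both scales, so the two legends can merge
  scale_fill_manual(name = "Outcome", values = cols, breaks = names(cols)) +
  scale_colour_manual(name = "Outcome", values = cols, breaks = names(cols))

p
```

Because both scales share the same title and labels, ggplot should draw a single legend whose keys show the line and the fill together.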
Upvotes: 1