mcl3743

Reputation: 1

How can I add a customized legend for multiple layers in ggplot2?

I need to make a line plot in ggplot2 using a data frame called df that looks something like this:

   DATE       ITEM        NUMBER_SOLD
    <date>     <chr>       <int>
 1 2018-01-08 APPLE         3
 2 2018-01-09 APPLE         3
 3 2018-01-09 PEAR          2
 4 2018-01-09 ORANGE        1
 5 2018-01-10 APPLE         2
 6 2018-01-10 PEAR          1
 7 2018-01-12 CHERRY        2
 8 2018-01-12 MANGO         1
 9 2018-01-15 PINEAPPLE     1
10 2018-01-15 APRICOT       1

etc

The data frame is a tibble with 336 rows in total, showing how many times a particular item was sold on a given day in 2018.

The plot needs to be a line plot showing the sales of one particular item (apple), with the date on the x axis and the number sold on the y axis, plus a second line showing a 15% increase in sales, like this:

df %>% filter(ITEM == "APPLE") %>%
  ggplot(aes(DATE, NUMBER_SOLD)) +
  geom_line(size = 1, col = "red") +
  theme(axis.text.x = element_text(angle = 90)) +
  geom_line(aes(y = NUMBER_SOLD + NUMBER_SOLD/100*15), col = "green4", size = 1, alpha = 0.6) +
  scale_x_date(date_labels="%b", date_breaks  = "1 month")

However, I also need to add a legend showing what each line represents, e.g. the red line for the original number of sales and the green one for the original number of sales + 15%. How might I achieve that?

Upvotes: 0

Views: 61

Answers (1)

Phil

Reputation: 8107

The trick is to do the calculation in the data frame first, then use gather() to reshape the data to long format, so that all the numbers sit in one column and a second column indicates whether each number is the actual or the expected sale.

library(tidyverse)

df <- tribble(~"DATE",       ~"ITEM",        ~"NUMBER_SOLD",
"2018-01-08", "APPLE",         3,
"2018-01-09", "APPLE",         3,
"2018-01-09", "PEAR",          2,
"2018-01-09", "ORANGE",        1,
"2018-01-10", "APPLE",         2,
"2018-01-10", "PEAR",          1,
"2018-01-12", "CHERRY",        2,
"2018-01-12", "MANGO",         1,
"2018-01-15", "PINEAPPLE",     1,
"2018-01-15", "APRICOT",       1) %>% 
  mutate(DATE = parse_date(DATE),
         NUMBER_SOLD_EXP = NUMBER_SOLD + NUMBER_SOLD/100*15) %>% 
  gather(key = category, value = SOLD, NUMBER_SOLD, NUMBER_SOLD_EXP)

df
# A tibble: 20 x 4
   DATE       ITEM      category         SOLD
   <date>     <chr>     <chr>           <dbl>
 1 2018-01-08 APPLE     NUMBER_SOLD      3   
 2 2018-01-09 APPLE     NUMBER_SOLD      3   
 3 2018-01-09 PEAR      NUMBER_SOLD      2   
 4 2018-01-09 ORANGE    NUMBER_SOLD      1   
 5 2018-01-10 APPLE     NUMBER_SOLD      2   
 6 2018-01-10 PEAR      NUMBER_SOLD      1   
 7 2018-01-12 CHERRY    NUMBER_SOLD      2   
 8 2018-01-12 MANGO     NUMBER_SOLD      1   
 9 2018-01-15 PINEAPPLE NUMBER_SOLD      1   
10 2018-01-15 APRICOT   NUMBER_SOLD      1   
11 2018-01-08 APPLE     NUMBER_SOLD_EXP  3.45
12 2018-01-09 APPLE     NUMBER_SOLD_EXP  3.45
13 2018-01-09 PEAR      NUMBER_SOLD_EXP  2.3 
14 2018-01-09 ORANGE    NUMBER_SOLD_EXP  1.15
15 2018-01-10 APPLE     NUMBER_SOLD_EXP  2.3 
16 2018-01-10 PEAR      NUMBER_SOLD_EXP  1.15
17 2018-01-12 CHERRY    NUMBER_SOLD_EXP  2.3 
18 2018-01-12 MANGO     NUMBER_SOLD_EXP  1.15
19 2018-01-15 PINEAPPLE NUMBER_SOLD_EXP  1.15
20 2018-01-15 APRICOT   NUMBER_SOLD_EXP  1.15
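
As a side note, gather() still works but is marked as superseded in recent tidyr releases, and pivot_longer() is the suggested replacement. The sketch below shows the same reshape with pivot_longer(), assuming tidyr 1.0.0 or later. The names apple and apple_long and the three-row toy data are only for illustration. Unlike gather(), the result keeps the two rows for each date next to each other, which makes no difference to the plot.

```r
library(dplyr)
library(tidyr)

# Toy data: only the APPLE rows from the question (illustrative)
apple <- tibble(
  DATE = as.Date(c("2018-01-08", "2018-01-09", "2018-01-10")),
  ITEM = "APPLE",
  NUMBER_SOLD = c(3, 3, 2)
)

apple_long <- apple %>%
  # Same 15% calculation as in the answer
  mutate(NUMBER_SOLD_EXP = NUMBER_SOLD + NUMBER_SOLD / 100 * 15) %>%
  # Stack both columns into one value column plus a category column
  pivot_longer(
    cols = c(NUMBER_SOLD, NUMBER_SOLD_EXP),
    names_to = "category",
    values_to = "SOLD"
  )

apple_long
```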

Now you just need to call geom_line() once, mapping the colour aesthetic to the variable that indicates whether the number is actual or expected sales. Add scale_colour_manual() to specify which colour goes with each category.

df %>% filter(ITEM == "APPLE") %>%
  ggplot(aes(DATE, SOLD)) +
  geom_line(aes(colour = category), size = 1) +
  scale_colour_manual(values = c("NUMBER_SOLD" = "red", "NUMBER_SOLD_EXP" = "green")) +
  theme(axis.text.x = element_text(angle = 90)) +
  scale_x_date(date_labels="%b", date_breaks  = "1 month")
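
Since the question asks for a customized legend, the legend title and entry text can probably be set in the same scale_colour_manual() call through its name, breaks and labels arguments. The following is a minimal sketch on a small hand-built long data frame. The name apple_long, the title "Sales" and the label text are illustrative choices, not part of the original answer, and linewidth assumes ggplot2 3.4.0 or later (older versions use size, as above).

```r
library(ggplot2)

# Small long-format data for APPLE only (illustrative values from the answer)
apple_long <- data.frame(
  DATE = as.Date(rep(c("2018-01-08", "2018-01-09", "2018-01-10"), times = 2)),
  category = rep(c("NUMBER_SOLD", "NUMBER_SOLD_EXP"), each = 3),
  SOLD = c(3, 3, 2, 3.45, 3.45, 2.3)
)

p <- ggplot(apple_long, aes(DATE, SOLD, colour = category)) +
  geom_line(linewidth = 1) +  # use size = 1 on ggplot2 older than 3.4.0
  scale_colour_manual(
    name = "Sales",                                  # legend title
    breaks = c("NUMBER_SOLD", "NUMBER_SOLD_EXP"),    # order of legend entries
    labels = c("Actual", "Actual + 15%"),            # text shown for each entry
    values = c(NUMBER_SOLD = "red", NUMBER_SOLD_EXP = "green4")
  )

p
```

Keeping breaks and labels in the same order avoids relying on how ggplot2 sorts the category values.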


Upvotes: 2
