Kerr McIntosh

Reputation: 131

Is there a way in R / ggplot2 of re-ordering the legend to match its line position?

Is there a way in R / ggplot2 to reorder the legend so that it matches the vertical position of the lines in the plot?

In this example, the blue line (non-melanoma skin cancer) would then be at the top of the legend.

library(tidyverse)

all_nhs_data <- read_csv("https://www.opendata.nhs.scot/dataset/c2c59eb1-3aff-48d2-9e9c-60ca8605431d/resource/3aef16b7-8af6-4ce0-a90b-8a29d6870014/download/opendata_inc9418_hb.csv")

borders_hb_cncr <- all_nhs_data %>% 
  filter(HB == "S08000016") %>% 
  select(CancerSite, Sex, Year, IncidencesAllAges, CrudeRate)

individual_viz <- borders_hb_cncr %>% 
  filter(CancerSite != "All cancer types") %>% 
  filter(case_when(
    IncidencesAllAges >= 50 & Year == 2018 ~ Sex == "All",
    TRUE ~ Sex == "All" & IncidencesAllAges > 50
  )) %>% 
  ggplot() +
  aes(x = Year, y = IncidencesAllAges, group = CancerSite, colour = CancerSite) +
  geom_line()

[Screenshot of the resulting line chart]

Upvotes: 2

Views: 101

Answers (2)

nniloc

Reputation: 4243

The forcats package (part of the tidyverse suite) has a function called fct_reorder2 which is intended for cases like this.

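fct_reorder2 takes a summary function, which defaults to last2(). It reorders a factor (CancerSite) by the last value of y (IncidencesAllAges) when sorted by x (Year), in descending order by default, so the line that ends highest comes first in the legend. See the final example in the forcats documentation for fct_reorder.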

library(tidyverse)

borders_hb_cncr %>% 
  filter(CancerSite != "All cancer types",
         case_when(
           IncidencesAllAges >=50 & Year == 2018 ~ Sex == "All",
           TRUE ~ Sex == "All" & IncidencesAllAges >50
         )) %>%  
  ggplot() +
  aes(x = Year, 
      y = IncidencesAllAges, 
      group = CancerSite, 
      colour = fct_reorder2(CancerSite, 
                            Year, 
                            IncidencesAllAges)) +
  geom_line() +
  labs(colour = 'Cancer Site') 
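
To see what fct_reorder2 does to the factor levels on their own, here is a minimal sketch on made-up data. The toy data frame and its column names (site, year, count) are hypothetical and only stand in for CancerSite, Year and IncidencesAllAges.

```r
library(forcats)

# Hypothetical toy data: three series, each observed in 2017 and 2018
toy <- data.frame(
  site  = rep(c("a", "b", "c"), each = 2),
  year  = rep(c(2017, 2018), times = 3),
  count = c(10, 30, 50, 20, 5, 40)
)

# Without reordering, the levels are alphabetical
default_levels <- levels(factor(toy$site))

# fct_reorder2 orders the levels by the count in the latest year,
# largest first (2018 counts: a = 30, b = 20, c = 40)
reordered <- fct_reorder2(toy$site, toy$year, toy$count)
levels(reordered)
```

The legend follows the level order, so mapping colour to the reordered factor should list "c" first here because it ends highest.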


Upvotes: 3

greg dubrow

Reputation: 633

My first instinct is to make CancerSite a factor and set the order you want in the levels argument. There might be a way to order it by each CancerSite's value in 2018 instead, which would let you reuse the code across plot permutations, but for this I just converted it to a factor by hand. This does change the colors from the original, but you can set them manually.

borders_hb_cncr %>% 
  filter(CancerSite != "All cancer types") %>% 
  filter(case_when(
    IncidencesAllAges >= 50 & Year == 2018 ~ Sex == "All",
    TRUE ~ Sex == "All" & IncidencesAllAges > 50
  )) %>%  
  mutate(CancerSite = factor(CancerSite, 
                             levels = c("Non-melanoma skin cancer", 
                                        "Basal cell carcinoma of the skin",
                                        "Breast", "Colon", "Colorectal cancer", 
                                        "Squamous cell carcinoma of the skin", 
                                        "Trachea, bronchus and lung"))) %>%
  ggplot() +
  aes(x = Year, y = IncidencesAllAges, colour = CancerSite) +
  geom_line()
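
The two loose ends above (deriving the order from the latest year rather than typing it out, and pinning the colors by hand) could be sketched roughly like this. It runs on made-up data, so the toy data frame, its column names and the hex colors are all assumptions, not values from the NHS dataset.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for CancerSite / Year / IncidencesAllAges
toy <- data.frame(
  site  = rep(c("a", "b", "c"), each = 2),
  year  = rep(c(2017, 2018), times = 3),
  count = c(10, 30, 50, 20, 5, 40)
)

# Derive the level order from the latest year, largest count first
latest_order <- toy %>% 
  filter(year == max(year)) %>% 
  arrange(desc(count)) %>% 
  pull(site)

toy_ordered <- toy %>% 
  mutate(site = factor(site, levels = latest_order))

# A named vector keeps each color attached to its series,
# whatever order the levels end up in (colors chosen arbitrarily)
site_colours <- c(a = "#1b9e77", b = "#d95f02", c = "#7570b3")

p <- ggplot(toy_ordered) +
  aes(x = year, y = count, colour = site) +
  geom_line() +
  scale_colour_manual(values = site_colours)
```

Because the order is computed from the data, the same code should still work if the filter or the year range changes.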


Upvotes: 1
