Nader Mehri

Reputation: 556

How to add labels to selected lines in a chart created by geom_line?

Using the code below, I created my chart of interest (please see the image below). Each line represents a state. I would like to customize the chart by making all the lines grey except for the states with the 5 highest values of my outcome variable of interest (logageadjustedrate) in the last year (Year = 2016). I would also like to add the state name as a label to those 5 lines. But I got an error: "geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?"

Here is my code:

library(dplyr, warn.conflicts = FALSE)

d_filtered <- policy_and_cause_of_deaths_fe %>%
  group_by(state_name) %>% 
  filter(Year=='2016' & logageadjustedrate>5.5) %>%  
  ungroup()

ggplot() +
  # draw the original data series with grey
  geom_line(aes(x = Year, y = logageadjustedrate, group = state_name), data = policy_and_cause_of_deaths_fe, colour = alpha("grey", 0.7)) +
  # colourise only the filtered data
  geom_line(aes(x = Year, y = logageadjustedrate, colour = state_name), data = d_filtered) +
  theme(legend.position = "none")

[Image: line chart of logageadjustedrate by Year, one line per state]

Upvotes: 0

Views: 279

Answers (1)

stefan

Reputation: 124148

Your issue is that your filtered dataset contains only one year, so each state is left with a single observation. You can't draw a line through only one data point, which is why you get the geom_path message.
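A quick way to confirm this is to count the rows per state after filtering. The sketch below uses made-up data; the data frame `toy` and its values are assumptions for illustration only, not your data:

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data: 3 states observed over 3 years
toy <- data.frame(
  Year = rep(2014:2016, times = 3),
  logageadjustedrate = c(5.1, 5.4, 5.8, 4.9, 5.0, 5.2, 5.6, 5.7, 6.0),
  state_name = rep(c("A", "B", "C"), each = 3)
)

# Same kind of filter as in the question: only 2016 rows survive
toy_filtered <- toy %>%
  filter(Year == 2016, logageadjustedrate > 5.5)

# Count observations per state: each remaining state has exactly one row,
# so geom_line() has nothing to connect within a group
obs_per_state <- toy_filtered %>% count(state_name)
obs_per_state
```

If every state shows a count of 1, the line layer has only isolated points to work with.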

Using only ggplot2 and dplyr you can achieve your desired result like so:

  1. Get the names of your top 5 states in 2016. Instead of hard-coding a threshold value, I make use of dplyr::top_n to get the top 5 states.

  2. For highlighting the lines, filter your dataset for the top states, keeping all of their years. This way your code will work.

  3. For adding labels (and/or highlighting only the last year), additionally filter by the last year.

  4. To add the labels, use the dataset from step 3 to add the state names via geom_text (additionally, I added a geom_point to highlight the last year for the top states).

Using some random data, try this:

# Random data
set.seed(2)
d <- data.frame(
  Year = rep(2009:2016, 52),
  logageadjustedrate = runif(8 * 52, 4, 7),
  state_name = rep(c(LETTERS, letters), each = 8)
)

library(ggplot2)
library(dplyr, warn.conflicts = FALSE)

# Names of the 5 states with the highest rate in 2016
top_states <- d %>%
  filter(Year == 2016) %>% 
  top_n(5, logageadjustedrate) %>% 
  pull(state_name)

d_filtered_line <- d %>% 
  filter(state_name %in% top_states)

d_filtered_point <- d %>% 
  filter(state_name %in% top_states, Year == 2016)

ggplot(mapping = aes(x =Year, y = logageadjustedrate)) +
  # draw the original data series with grey
  geom_line(aes(group=state_name), data = d, colour = alpha("grey", 0.7)) +
  # colourise only the filtered data
  geom_line(aes(colour = state_name), data = d_filtered_line)+
  geom_point(aes(colour = state_name), data = d_filtered_point)+
  geom_text(aes(colour = state_name, label = state_name), nudge_x = .2, data = d_filtered_point)+
  theme(legend.position="none")
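As a variation on step 1, and assuming dplyr 1.0.0 or later (where slice_max() is available and top_n() is marked as superseded), the top states can be selected without hard-coding the year. This is only a sketch; the names `last_year` and `top_states_alt` are illustrative and not part of the code above:

```r
library(dplyr, warn.conflicts = FALSE)

# Same random data as above
set.seed(2)
d <- data.frame(
  Year = rep(2009:2016, 52),
  logageadjustedrate = runif(8 * 52, 4, 7),
  state_name = rep(c(LETTERS, letters), each = 8)
)

# Use the most recent year in the data instead of hard-coding 2016
last_year <- max(d$Year)

# slice_max() keeps the n rows with the largest values; with_ties = FALSE
# returns exactly 5 rows even if some rates are tied
top_states_alt <- d %>%
  filter(Year == last_year) %>%
  slice_max(logageadjustedrate, n = 5, with_ties = FALSE) %>%
  pull(state_name)

top_states_alt
```

The resulting vector can be used in place of top_states in the filters for d_filtered_line and d_filtered_point.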

Upvotes: 1
