Mathica

Reputation: 1303

How to show top n lines of grouped line plots in ggplot R

Consider the data below:

## read in data set (tolerance data from ALDA book)

tolerance <-  read.csv("https://stats.idre.ucla.edu/stat/r/examples/alda/data/tolerance1_pp.txt")
 
 ## change id and male to factor variables
 tolerance <- within(tolerance, {
   id <- factor(id)
   male <- factor(male, levels = 0:1, labels = c("female", "male"))
 })
 
 ## view the first few rows of the dataset
 head(tolerance)

  id age tolerance   male exposure time
1  9  11      2.23 female     1.54    0
2  9  12      1.79 female     1.54    1
3  9  13      1.90 female     1.54    2
4  9  14      2.12 female     1.54    3
5  9  15      2.66 female     1.54    4
6 45  11      1.12   male     1.16    0

This gives a time series plot, as below:

ggplot(data = tolerance, aes(x = time, y = tolerance, group = id, color= id)) +
   geom_line() +
   geom_point()


I do not want ALL the lines, only the top three lines (representing the most frequent ids at each time point).

I tried top_n(3, tolerance), but it does not give the top three lines. Not surprisingly, it gives the top three points.

Any idea how to do this?
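
To make the row-versus-line difference concrete, here is a minimal sketch on a made-up data frame. The name toy and its columns id, time and value are hypothetical stand-ins, not the ALDA data. top_n() ranks individual rows, so the rows it returns can all belong to a single id:

```r
library(dplyr)

# Hypothetical toy data: 4 ids observed at 3 time points (not the ALDA data)
toy <- data.frame(
  id    = factor(rep(c("a", "b", "c", "d"), each = 3)),
  time  = rep(0:2, times = 4),
  value = c(9.1, 9.2, 9.3,   # id a
            1.1, 1.2, 1.3,   # id b
            2.1, 2.2, 2.3,   # id c
            3.1, 3.2, 3.3)   # id d
)

# top_n() works on rows: it keeps the 3 rows with the largest value
top_rows <- toy %>% top_n(3, value)

# Number of distinct ids (lines) those rows come from
n_lines <- n_distinct(top_rows$id)
```

In this toy case all three rows come from one id, so plotting top_rows would draw a single line rather than three.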

Upvotes: 0

Views: 119

Answers (1)

TarJae

Reputation: 79184

Maybe something like this. First compute the mean tolerance for each id, then keep the ids with the three highest means and plot:

library(tidyverse)

tolerance %>% 
  group_by(id) %>% 
  mutate(group_mean = mean(tolerance, na.rm = TRUE)) %>% 
  ungroup() %>% 
  # rank ids by their mean; cur_group_id() would only keep the first 3 id levels
  filter(dense_rank(desc(group_mean)) <= 3) %>% 
  ggplot(aes(x = time, y = tolerance, group = id, color= id)) +
  geom_point()+
  geom_line()
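
The question's wording, "most frequent ids at each time point", can also be read another way: rank the ids within each time point, count how often each id lands in the top three, and keep the three ids that do so most often. Below is a minimal sketch of that reading, which may or may not be what was intended. It uses a made-up data frame; toy and its columns id, time and value are hypothetical, not the ALDA data.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: 5 ids observed at 3 time points (not the ALDA data)
toy <- data.frame(
  id    = factor(rep(c("a", "b", "c", "d", "e"), each = 3)),
  time  = rep(0:2, times = 5),
  value = c(5.0, 5.1, 5.2,   # a: in the top 3 at every time point
            4.0, 4.1, 4.2,   # b: in the top 3 at every time point
            3.0, 3.1, 0.5,   # c: in the top 3 at times 0 and 1
            1.0, 1.1, 3.2,   # d: in the top 3 at time 2 only
            0.1, 0.2, 0.3)   # e: never in the top 3
)

top_ids <- toy %>%
  group_by(time) %>%
  slice_max(value, n = 3, with_ties = FALSE) %>%   # top 3 ids at each time point
  ungroup() %>%
  count(id, name = "times_in_top") %>%             # how often each id made the top 3
  slice_max(times_in_top, n = 3, with_ties = FALSE) %>%
  pull(id)

# Keep every time point of the selected ids so whole lines are drawn
top_lines <- toy %>% filter(id %in% top_ids)

p <- ggplot(top_lines, aes(x = time, y = value, group = id, color = id)) +
  geom_point() +
  geom_line()
```

Setting with_ties = FALSE caps each selection at exactly three; if tied ids should all be kept, drop that argument.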


Upvotes: 1
