Reputation: 1303
Consider the data below (from the UCLA IDRE R examples):
tolerance <- read.csv("https://stats.idre.ucla.edu/stat/r/examples/alda/data/tolerance1_pp.txt")
## change id and male to factor variables
tolerance <- within(tolerance, {
  id <- factor(id)
  male <- factor(male, levels = 0:1, labels = c("female", "male"))
})
## view the first few rows of the dataset
head(tolerance)
  id age tolerance   male exposure time
1  9  11      2.23 female     1.54    0
2  9  12      1.79 female     1.54    1
3  9  13      1.90 female     1.54    2
4  9  14      2.12 female     1.54    3
5  9  15      2.66 female     1.54    4
6 45  11      1.12   male     1.16    0
I have a time series plot, built as below:
library(ggplot2)
ggplot(data = tolerance, aes(x = time, y = tolerance, group = id, color = id)) +
  geom_line() +
  geom_point()
I do not want all of the lines, only the top three (representing the most frequent ids at each time point).
I tried top_n(3, tolerance),
but it does not give the top three lines. Not surprisingly, it gives the top three points.
Any idea how to get this?
Upvotes: 0
Views: 119
Reputation: 79184
Maybe something like this. First compute the mean tolerance for each id, then keep the ids with the three highest means and plot:
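library(tidyverse)
tolerance %>%
  group_by(id) %>%
  # attach each id's mean tolerance to all of its rows
  mutate(group_mean = mean(tolerance, na.rm = TRUE)) %>%
  ungroup() %>%
  # cur_group_id() numbers groups by id level, not by mean, so rank the means instead;
  # dense_rank(desc(...)) gives rank 1 to the highest mean (tied ids share a rank)
  filter(dense_rank(desc(group_mean)) <= 3) %>%
  ggplot(aes(x = time, y = tolerance, group = id, color = id)) +
  geom_point() +
  geom_line()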
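As a variation on the same idea, the selection of ids can be separated from the plotting step. The sketch below is only an illustration under assumptions: the `toy` data frame is made up so the example runs without downloading anything, and the names `top_ids`, `top_lines` and `p` are hypothetical. It reduces the data to one row per id, keeps the three ids with the highest mean using `slice_max()`, and then uses `semi_join()` to pull back every time point for those ids.

```r
library(dplyr)
library(ggplot2)

# Toy data, made up for illustration: 5 ids observed at 4 time points each
toy <- expand.grid(id = factor(1:5), time = 0:3)
toy$tolerance <- as.numeric(as.character(toy$id)) + 0.1 * toy$time

# Step 1: one row per id, then keep the 3 ids with the highest mean
top_ids <- toy %>%
  group_by(id) %>%
  summarise(group_mean = mean(tolerance, na.rm = TRUE), .groups = "drop") %>%
  slice_max(group_mean, n = 3, with_ties = FALSE)

# Step 2: keep every row (all time points) belonging to those ids
top_lines <- toy %>%
  semi_join(top_ids, by = "id")

# Step 3: plot only the selected lines
p <- ggplot(top_lines, aes(x = time, y = tolerance, group = id, color = id)) +
  geom_line() +
  geom_point()
```

Printing `p` draws the plot. If "top" should mean the most frequently observed ids rather than the highest mean, the `mean()` summary in step 1 could presumably be swapped for a row count such as `n()`; steps 2 and 3 would stay the same.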
Upvotes: 1