Reputation: 65
I am working on a school project and have a data set of 4,000 rows. There are 40 participants and each has about 100 rows. I want to create a data set that collapses the rows for each participant into summary statistics, ideally the 90th percentile. I know how to find the mean values with dplyr:
Means <- bladder %>%
  group_by(id, group) %>%
  summarise(across(everything(), list(mean)))
And this works great. But is there a way to do the same thing and get the 90th percentiles instead of the means?
Thank you!!
Upvotes: 4
Views: 694
Reputation: 188
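The following code also gives the solution:
library(dplyr)

Percentile90 <- survival::bladder %>%
  group_by(id, rx) %>%
  # a lambda is used because passing extra arguments through across() is deprecated as of dplyr 1.1.0
  summarise(across(everything(), ~ quantile(.x, probs = 0.9, na.rm = TRUE)))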
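The na.rm argument in the call above matters when a column contains missing values: by default, quantile() stops with an error if its input has any NA. A minimal base-R sketch of the difference, using a made-up vector (x is hypothetical and not taken from the bladder data):

```r
# Hypothetical vector with one missing value (not from the bladder data)
x <- c(2, 4, NA, 8, 10)

# Without na.rm = TRUE, quantile() raises an error on NA input;
# tryCatch() turns that error into a marker string so the script keeps running
res_default <- tryCatch(
  quantile(x, probs = 0.9),
  error = function(e) "error"
)

# With na.rm = TRUE, the NA is dropped before the percentile is computed
p90 <- quantile(x, probs = 0.9, na.rm = TRUE, names = FALSE)

print(res_default)
print(p90)
```

If the columns are known to be complete, na.rm can be left out; keeping it is a harmless safeguard.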
Upvotes: 0
Reputation: 19097
The function to calculate percentiles in R is quantile. We can specify probs = 0.9 to get the 90th percentile.
Here I use the bladder dataset from the survival package to demonstrate.
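library(dplyr)

survival::bladder %>%
  group_by(id, rx) %>%
  # .groups belongs to summarize(), not across(); inside across() it would be passed on to quantile()
  summarize(across(everything(), ~ quantile(.x, probs = 0.9)), .groups = "drop")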
# A tibble: 85 × 7
      id    rx number  size  stop event  enum
   <int> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
 1     1     1      1     3     1   0     3.7
 2     2     1      2     1     4   0     3.7
 3     3     1      1     1     7   0     3.7
 4     4     1      5     1    10   0     3.7
 5     5     1      4     1    10   0.7   3.7
 6     6     1      1     1    14   0     3.7
 7     7     1      1     1    18   0     3.7
 8     8     1      1     3    18   0.7   3.7
 9     9     1      1     1    18   1     3.7
10    10     1      3     3    23   0     3.7
# … with 75 more rows
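The non-integer values in the output (such as 3.7 in the enum column) come from linear interpolation: by default quantile() uses type = 7, which interpolates between neighbouring sorted values. A small sketch reproducing 3.7 by hand, assuming each participant's enum values are 1, 2, 3, 4 (an assumption about the data, not shown in the output above):

```r
# Assumed enum values for a single participant
enum <- 1:4

# Default method (type = 7): position h = (n - 1) * p + 1 in the sorted data,
# then interpolate linearly between the two neighbouring values
n <- length(enum)
h <- (n - 1) * 0.9 + 1
lower <- floor(h)
by_hand <- enum[lower] + (h - lower) * (enum[lower + 1] - enum[lower])

# Same calculation done by quantile(); names = FALSE drops the "90%" label
p90 <- quantile(enum, probs = 0.9, names = FALSE)

print(c(by_hand = by_hand, quantile = p90))
```

If interpolated values are not wanted, for example for a 0/1 column like event, quantile() also accepts a type argument; types 1 to 3 are discontinuous and return only values that occur in the data.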
Upvotes: 4