Reputation: 37
I previously asked the following question: Mutate new column over a large list of tibbles, and the solutions given were perfect. I now have a follow-up question.
I now have the following dataset:
df1:
name | group | competition | metric | value |
---|---|---|---|---|
A | A | comp A | distance | 10569 |
B | A | comp B | distance | 12939 |
C | A | comp C | distance | 11532 |
A | A | comp B | psv-99 | 29.30 |
B | A | comp A | psv-99 | 30.89 |
C | A | comp C | psv-99 | 32.00 |
I now want to find the percentile rank of all the values in df1, but based only on the group and one of the competitions: comp A.
Upvotes: 1
Views: 532
Reputation: 887771
We could slice the rows where 'comp A' is found %in% competition, then group by the 'group' column and create a new column, percentile, with percent_rank.
library(dplyr)
df1 %>%
  # keep only the rows whose competition is "comp A"
  slice(which(competition %in% "comp A")) %>%
  group_by(group) %>%
  # percentile rank of value within each group
  mutate(percentile = percent_rank(value))
Upvotes: 2
Reputation: 389235
You can filter the competition and group_by group.
library(dplyr)
df1 %>%
  filter(competition == "comp A") %>%
  group_by(group) %>%
  mutate(percentile = percent_rank(value))
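To make the filter-then-group approach easy to try, here is a minimal, self-contained sketch that rebuilds df1 from the table in the question. Grouping by metric as well as group is an assumption on my part (it is not part of the code above): distance and psv-99 are on different scales, so ranking them together within a group is probably not what is wanted. The name comp_a is just illustrative.

```r
library(dplyr)

# df1 rebuilt from the table in the question
df1 <- tibble::tribble(
  ~name, ~group, ~competition, ~metric,    ~value,
  "A",   "A",    "comp A",     "distance", 10569,
  "B",   "A",    "comp B",     "distance", 12939,
  "C",   "A",    "comp C",     "distance", 11532,
  "A",   "A",    "comp B",     "psv-99",   29.30,
  "B",   "A",    "comp A",     "psv-99",   30.89,
  "C",   "A",    "comp C",     "psv-99",   32.00
)

comp_a <- df1 %>%
  filter(competition == "comp A") %>%       # keep comp A only
  group_by(group, metric) %>%               # assumption: rank each metric separately
  mutate(percentile = percent_rank(value)) %>%
  ungroup()

comp_a
```

With only the six sample rows, each group/metric combination in comp A contains a single value, so the resulting rank is not informative; the pattern only becomes useful on the full dataset.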
Upvotes: 1
Reputation: 1474
Maybe just change metric to competition in the previous code? It would give you the percentile rank for all competitions, including A.
library(dplyr)  # group_nest(), mutate()
library(purrr)  # map()
library(tidyr)  # unnest()
df1 %>%
  group_nest(group, competition) %>%
  mutate(percentile = map(data, ~percent_rank(.$value))) %>%
  unnest(c(data, percentile))
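If the nesting is not needed for anything else, a plain group_by() followed by mutate() should give the same percentiles with less machinery. This is only a sketch of that alternative, not the code from the earlier question: df1 is rebuilt from the question's table, and ranked is an illustrative name.

```r
library(dplyr)

# df1 rebuilt from the table in the question
df1 <- tibble::tribble(
  ~name, ~group, ~competition, ~metric,    ~value,
  "A",   "A",    "comp A",     "distance", 10569,
  "B",   "A",    "comp B",     "distance", 12939,
  "C",   "A",    "comp C",     "distance", 11532,
  "A",   "A",    "comp B",     "psv-99",   29.30,
  "B",   "A",    "comp A",     "psv-99",   30.89,
  "C",   "A",    "comp C",     "psv-99",   32.00
)

# percentile rank within every group/competition pair, no nesting required
ranked <- df1 %>%
  group_by(group, competition) %>%
  mutate(percentile = percent_rank(value)) %>%
  ungroup()

# keep only comp A if that is the only competition of interest
ranked %>% filter(competition == "comp A")
```

Note that this ranks every metric together within a competition, exactly as the nested version does; add metric to group_by() if each metric should be ranked on its own.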
Upvotes: 1