Peter Skeels

Reputation: 37

Mutate percentile rank based on two columns

I previously asked the question "Mutate new column over a large list of tibbles", and the solutions given were perfect. I now have a follow-up question.

I now have the following dataset:

df1:

name  group  competition  metric    value
A     A      comp A       distance  10569
B     A      comp B       distance  12939
C     A      comp C       distance  11532
A     A      comp B       psv-99    29.30
B     A      comp A       psv-99    30.89
C     A      comp C       psv-99    32.00

I now want to find the percentile rank of all the values in df1, but based only on the group and one of the competitions, comp A.

Upvotes: 1

Views: 532

Answers (3)

akrun

Reputation: 887771

We could slice the rows where competition is 'comp A', then group by the 'group' column and create a new column, percentile, with percent_rank.

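library(dplyr)
# df1 is the data frame from the question
df1 <- df1 %>%
   # keep only the rows for comp A
   slice(which(competition %in% "comp A")) %>%
   # rank within each group
   group_by(group) %>%
   mutate(percentile = percent_rank(value))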
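
For reference, here is a rough base-R sketch of what percent_rank computes: the minimum rank rescaled to the range 0 to 1, i.e. (rank - 1) / (n - 1). The helper name percent_rank_sketch is made up for illustration and is not part of dplyr.

```r
# Hypothetical helper: a base-R approximation of dplyr::percent_rank
percent_rank_sketch <- function(x) {
  # tied values share the lowest rank; NA values stay NA
  r <- rank(x, ties.method = "min", na.last = "keep")
  (r - 1) / (sum(!is.na(x)) - 1)
}

# The three "distance" values from the question
percent_rank_sketch(c(10569, 12939, 11532))
# 0.0 1.0 0.5
```

The smallest value maps to 0 and the largest to 1, so a group needs at least two non-missing values for the result to be meaningful.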

Upvotes: 2

Ronak Shah

Reputation: 389235

You can filter for the competition and then group_by group.

library(dplyr)

# df1 is the data frame from the question
df1 %>%
  filter(competition == "comp A") %>%
  group_by(group) %>%
  mutate(percentile = percent_rank(value))
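
To try this on the sample data from the question, something like the sketch below should work. Note that the sample mixes two metrics on very different scales (distance and psv-99), so grouping by metric as well is probably what is wanted; that extra grouping column is an assumption on my part and is not stated in the question. The name comp_a is only for illustration.

```r
library(dplyr)

# Sample data copied from the question
df1 <- tibble(
  name        = c("A", "B", "C", "A", "B", "C"),
  group       = "A",
  competition = c("comp A", "comp B", "comp C", "comp B", "comp A", "comp C"),
  metric      = c("distance", "distance", "distance", "psv-99", "psv-99", "psv-99"),
  value       = c(10569, 12939, 11532, 29.30, 30.89, 32.00)
)

comp_a <- df1 %>%
  filter(competition == "comp A") %>%
  group_by(group, metric) %>%   # assumption: rank within each metric as well
  mutate(percentile = percent_rank(value)) %>%
  ungroup()

comp_a
```

The sample has only one comp A row per metric, so the ranks only become informative on the full dataset, where each group and metric has several comp A values.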

Upvotes: 1

Zaw

Reputation: 1474

Maybe just change metric to competition in the previous code? It would give you the percentile rank within every competition, including comp A.

library(dplyr)  # group_nest, mutate, percent_rank
library(tidyr)  # unnest
library(purrr)  # map

df1 %>%
  group_nest(group, competition) %>%
  mutate(percentile = map(data, ~percent_rank(.$value))) %>%
  unnest(c(data, percentile))

Upvotes: 1
