Marco Pastor Mayo

Reputation: 853

Use ntile() with group_by() with dplyr

I want to calculate the quintile of groups in a data.frame such as this:

df <- data.frame(x=1:100, y=c(rep("A", 50), rep("B", 50)))

Using the ntile() function and group_by() from dplyr, I thought I could get the grouped quintiles, as shown here. However, as the table shows, the quintiles have been calculated with respect to the whole dataset. I want a result with 10 observations in each quintile for both A and B.

df$z <- df %>% group_by(y) %>% mutate(z = ntile(x, 5)) %>% pull(z)

table(df$y, df$z)

     1  2  3  4  5
  A 20 20 10  0  0
  B  0  0 10 20 20

Upvotes: 5

Views: 4946

Answers (1)

Cettt

Reputation: 11981

Make sure to start a new R session, then try this:

library(dplyr)
df <- data.frame(x=1:100, y=c(rep("A", 50), rep("B", 50))) %>% 
   group_by(y) %>% mutate(z = ntile(x, 5))

table(df$y, df$z)
     1  2  3  4  5
  A 10 10 10 10 10
  B 10 10 10 10 10
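If restarting the session is not convenient, one thing worth checking is whether another attached package is masking dplyr's verbs. This is an assumption about the asker's session, not something the question confirms: for example, when plyr is attached after dplyr, its own mutate() takes precedence and ignores grouping, which would produce the ungrouped quintiles shown in the question. A minimal sketch that sidesteps masking by namespace-qualifying the calls (the name `res` is only illustrative):

```r
library(dplyr)

df <- data.frame(x = 1:100, y = c(rep("A", 50), rep("B", 50)))

# Namespace-qualified calls, so a same-named function from another
# attached package cannot be picked up by mistake
res <- df %>%
  dplyr::group_by(y) %>%
  dplyr::mutate(z = dplyr::ntile(x, 5)) %>%
  dplyr::ungroup()

# Each group should now have 10 rows per quintile
table(res$y, res$z)
```

Running conflicts() in the affected session lists the names that are currently masked, which can help confirm or rule out this explanation.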

Also, a dplyr alternative to table would be count:

count(df, y, z)
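As a self-contained sketch of the count() variant, the snippet below rebuilds the data, ungroups it after computing the quintiles, and stores the tally in a variable (`counts` is an illustrative name). The result is in long format, one row per combination of y and z with the group size in column n, rather than the wide cross-tabulation that table() prints.

```r
library(dplyr)

df <- data.frame(x = 1:100, y = c(rep("A", 50), rep("B", 50))) %>%
  group_by(y) %>%
  mutate(z = ntile(x, 5)) %>%
  ungroup()

# One row per (y, z) pair; the number of rows in each pair is in column n
counts <- count(df, y, z)
counts
```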

Upvotes: 7
