Karthik g

Reputation: 305

arrange() not working on grouped data frame

Assume I have the following code. In the last step, where I try arranging it, the code doesn't work and the data frame continues to be arranged in ascending order by cyl.

library(dplyr)
# create a grouped data frame
df <- group_by(mtcars, cyl)
# rank car from best mpg to worst for every cyl
df <- mutate(df, rank = row_number(mpg))
# top 3 best mpg for each cyl
df <- filter(df, rank <= 3)
# arrange by the number of cyl
df <- arrange(df, desc(cyl), rank)

Any thoughts on why this is happening?

Upvotes: 20

Views: 15177

Answers (1)

Rich Scriven

Reputation: 99331

It's not working because you need to ungroup() the data before arranging by cyl. The code you are using attempts to order the cyl column while it's still grouped by cyl. Since those values are all the same (within each group), nothing changes.

To arrange the entire data frame by cyl after ranking, we need to remove the grouping first, and then we can run arrange() again.

library(dplyr)

group_by(mtcars, cyl) %>%                ## group by cylinder
    mutate(rank = row_number(mpg)) %>%   ## rank by mpg
    filter(rank <= 3) %>%                ## top three for each cyl
    arrange(rank) %>%                    ## arrange each group by rank
    ungroup() %>%                        ## remove grouping
    arrange(desc(cyl))                   ## arrange all by cylinder (descending)

#    mpg cyl  disp  hp drat    wt  qsec vs am gear carb rank
# 1 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4    1
# 2 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4    2
# 3 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4    3
# 4 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4    1
# 5 18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1    2
# 6 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4    3
# 7 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2    1
# 8 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1    2
# 9 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1    3
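
A note for readers on a newer dplyr: as far as I know, arrange() has ignored grouping by default since dplyr 0.5.0, and a later release added a `.by_group` argument for sorting within groups. If that holds for your installed version, the sort in the question should already give the descending-cyl order, although the result stays grouped until you call ungroup(). A minimal sketch, assuming a recent dplyr (1.0 or later); the names `top3`, `sorted_all` and `sorted_within` are only illustrative:

```r
library(dplyr)

# Same ranking and filtering steps as above, kept grouped by cyl.
top3 <- mtcars %>%
    group_by(cyl) %>%
    mutate(rank = row_number(mpg)) %>%
    filter(rank <= 3)

# Assumption: recent dplyr, where arrange() ignores grouping by default.
# This sorts the whole data frame by cyl (descending), then by rank.
sorted_all <- arrange(top3, desc(cyl), rank)

# .by_group = TRUE sorts by the grouping variable first (ascending cyl),
# then by rank within each group.
sorted_within <- arrange(top3, rank, .by_group = TRUE)
```

Calling ungroup() first, as in the pipeline above, gives the same row order and also drops the grouping, so it should behave the same across versions.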

As a side note, I would recommend chaining these calls together with the %>% operator, as it will considerably cut down on assignments made with <-.
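
One more thing worth checking against the output above: row_number(mpg) ranks in ascending order, so rank 1 is the lowest mpg in each group (10.4 for the 8-cylinder cars). If the goal is the three highest-mpg cars per cylinder count, as the comments in the question suggest, wrapping mpg in desc() should do it. A sketch along those lines, with `best3` as an illustrative name:

```r
library(dplyr)

best3 <- mtcars %>%
    group_by(cyl) %>%
    mutate(rank = row_number(desc(mpg))) %>%  # rank 1 = highest mpg in the group
    filter(rank <= 3) %>%                     # keep the three highest per cyl
    ungroup() %>%                             # remove grouping
    arrange(desc(cyl), rank)                  # cyl descending, best mpg first
```

Note that row_number() breaks ties by row order, so when two cars share an mpg value only one of them may make the cut.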

Upvotes: 31
