Devin

Reputation: 323

How to filter to top n for specified column?

I am attempting to find the top n cyl, measured by AVGMPG, for each carb, and then omit everything else from the data frame. My actual problem involves identifying the top sales staff in each market by close rate, which hopefully makes it clearer what I am trying to do. Is there an easy way to do this?

> mtcars.1 <- mtcars %>%
+   group_by(carb,cyl) %>%
+   summarise(AVGMPG = mean(mpg))
> mtcars.1
# A tibble: 9 x 3
# Groups:   carb [?]
   carb   cyl AVGMPG
  <dbl> <dbl>  <dbl>
1     1     4   27.6
2     1     6   19.8
3     2     4   25.9
4     2     8   17.2
5     3     8   16.3
6     4     6   19.8
7     4     8   13.2
8     6     6   19.7
9     8     8   15

Upvotes: 2

Views: 69

Answers (2)

Humpelstielzchen

Reputation: 6441

A possible solution with data.table

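library(data.table)

data(mtcars)
setDT(mtcars)   # convert mtcars to a data.table by reference

# add the mean mpg of each carb/cyl combination as a new column
mtcars[, AVGMPG := mean(mpg), keyby = list(carb, cyl)]
# sort by descending AVGMPG, keep the top 3 rows per carb, then select the columns
mtcars[order(-AVGMPG), head(.SD, n = 3), by = carb][, .(carb, cyl, AVGMPG)]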

    carb cyl AVGMPG
 1:    1   4  27.58
 2:    1   4  27.58
 3:    1   4  27.58
 4:    2   4  25.90
 5:    2   4  25.90
 6:    2   4  25.90
 7:    4   6  19.75
 8:    4   6  19.75
 9:    4   6  19.75
10:    6   6  19.70
11:    3   8  16.30
12:    3   8  16.30
13:    3   8  16.30
14:    8   8  15.00

This calculates the mean of mpg for each carb and cyl combination, orders the rows by descending AVGMPG, picks the top 3 rows within each carb, and then discards all columns other than carb, cyl and AVGMPG.
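The output above repeats a carb/cyl combination once per matching car, because the top rows are taken from the unsummarised data. If one row per combination is wanted, a minimal sketch (assuming the stock mtcars data; the names n, dt, avg and top are introduced here for illustration only) is to aggregate first and then take the top rows per carb:

```r
library(data.table)

n <- 1                                  # hypothetical cutoff: keep the best cyl per carb
dt <- as.data.table(datasets::mtcars)   # work on a copy so mtcars itself is untouched

# one row per carb/cyl combination with its mean mpg
avg <- dt[, .(AVGMPG = mean(mpg)), by = .(carb, cyl)]

# sort by descending AVGMPG, then keep the first n rows within each carb
top <- avg[order(-AVGMPG), head(.SD, n), by = carb]
top
```

Raising n keeps more cyl values per carb; in mtcars no carb has more than two, so n = 3 would simply return every combination.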

Upvotes: 1

akrun

Reputation: 886968

We can try

library(dplyr)
n <- 3
mtcars %>% 
   group_by(carb) %>% 
   mutate(AVGMPG = mean(mpg)) %>%  
   group_by(cyl) %>%     
   top_n(n, AVGMPG) %>%
   select(carb, cyl, AVGMPG)
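
Note that the code above averages mpg within carb only and then ranks within cyl. If the goal is instead the question's AVGMPG per carb/cyl combination, ranked within each carb, a possible variant is sketched below. It assumes dplyr 1.0.0 or later (for slice_max and the .groups argument), and top_cyl is a name made up for this example:

```r
library(dplyr)

n <- 1  # hypothetical cutoff: keep the best cyl per carb

top_cyl <- mtcars %>%
  group_by(carb, cyl) %>%
  summarise(AVGMPG = mean(mpg), .groups = "drop") %>%  # one row per carb/cyl
  group_by(carb) %>%
  slice_max(AVGMPG, n = n) %>%                         # top n rows within each carb
  ungroup()

top_cyl
```

slice_max keeps ties by default, so a carb can return more than n rows when several cyl values share the same AVGMPG; passing with_ties = FALSE would cap it at n.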

Upvotes: 1
