Reputation: 323
I am trying to find the top n cyl values, ranked by AVGMPG, within each carb, and then omit everything else from the data frame. My actual problem involves identifying the top sales staff in each market, ranked by close rate; hopefully that makes it clearer what I am trying to do. Is there an easy way to do this?
> mtcars.1 <- mtcars %>%
+   group_by(carb, cyl) %>%
+   summarise(AVGMPG = mean(mpg))
> mtcars.1
# A tibble: 9 x 3
# Groups:   carb [?]
   carb   cyl AVGMPG
  <dbl> <dbl>  <dbl>
1     1     4   27.6
2     1     6   19.8
3     2     4   25.9
4     2     8   17.2
5     3     8   16.3
6     4     6   19.8
7     4     8   13.2
8     6     6   19.7
9     8     8   15  
Upvotes: 2
Views: 69
Reputation: 6441
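A possible solution with data.table:
library(data.table)
data(mtcars)
setDT(mtcars)  # convert the data.frame to a data.table by reference
# add the mean mpg of each carb/cyl combination to every row
mtcars[, AVGMPG := mean(mpg), keyby = list(carb, cyl)]
# sort by descending AVGMPG, keep the first 3 rows of each carb, then select columns
mtcars[order(-AVGMPG), head(.SD, n = 3), by = carb][, .(carb, cyl, AVGMPG)]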
    carb cyl AVGMPG
 1:    1   4  27.58
 2:    1   4  27.58
 3:    1   4  27.58
 4:    2   4  25.90
 5:    2   4  25.90
 6:    2   4  25.90
 7:    4   6  19.75
 8:    4   6  19.75
 9:    4   6  19.75
10:    6   6  19.70
11:    3   8  16.30
12:    3   8  16.30
13:    3   8  16.30
14:    8   8  15.00
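This calculates the mean of mpg for each combination of carb and cyl, orders the rows by AVGMPG in descending order, picks the first 3 rows within each carb group, and then discards all other columns. Because AVGMPG is added to every row rather than aggregated, the same carb/cyl pair can appear more than once in the result.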
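If one row per carb/cyl pair is wanted instead, a possible variant is to aggregate first and only then take the top rows per carb. The following is a minimal sketch of that idea, not a tested drop-in replacement; the names dt, avg and top_cyl and the value n = 1 are assumptions chosen for illustration.

```r
library(data.table)

n <- 1  # assumed value for illustration; the question leaves n open

dt <- as.data.table(mtcars)  # work on a copy so the built-in mtcars is untouched

# One row per carb/cyl pair with its mean mpg
avg <- dt[, .(AVGMPG = mean(mpg)), by = .(carb, cyl)]

# Sort by descending AVGMPG, then keep the first n rows within each carb
top_cyl <- avg[order(-AVGMPG), head(.SD, n), by = carb]
top_cyl
```

With n = 1 this should leave a single cyl per carb; raising n keeps more cyl values wherever a carb has them.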
Upvotes: 1
Reputation: 886968
We can try:
library(dplyr)
n <- 3
mtcars %>%
  group_by(carb, cyl) %>%
  summarise(AVGMPG = mean(mpg)) %>%  # one row per carb/cyl; the result stays grouped by carb
  top_n(n, AVGMPG)                   # keep the n highest AVGMPG values within each carb
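The same pattern could carry over to the sales problem mentioned in the question. The sketch below uses a made-up data frame: the column names market, staff and closed, the staff names, and n = 1 are all hypothetical, and the close rate is assumed to be the mean of a 0/1 closed flag.

```r
library(dplyr)

# Hypothetical data: one row per sales call (all names and values are made up)
sales <- data.frame(
  market = c("East", "East", "East", "East", "West", "West", "West", "West"),
  staff  = c("Ann",  "Ann",  "Bob",  "Bob",  "Cat",  "Cat",  "Dan",  "Dan"),
  closed = c(1,      1,      1,      0,      0,      0,      1,      0)
)

n <- 1  # assumed value for illustration

top_staff <- sales %>%
  group_by(market, staff) %>%
  summarise(close_rate = mean(closed)) %>%  # close rate per staff member within a market
  group_by(market) %>%                      # regroup explicitly by market
  top_n(n, close_rate) %>%                  # keep the n best per market (ties are kept)
  ungroup()

top_staff
```

For this toy data the result keeps Ann for East and Dan for West. Note that top_n() keeps ties, so a market can return more than n rows when close rates are equal.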
Upvotes: 1