PhiloT

Reputation: 181

How to "bring along another variable" with dplyr::summarise

I want to get the maximum value of one variable within each combination of conditions, but also bring along the value of another variable from the same row.

library(dplyr)

df <- mtcars %>%
  group_by(gear, carb) %>%
  summarise(max_cyl = max(cyl))

But how do I "bring along" the mpg of the car that has that maximum? This seems like a basic thing, but it also appears to be absent from the dplyr tutorials.

In other words, I want to select only those cars with the maximum number of cylinders in each gear x carb condition, and know the gas mileage for that same car.

Upvotes: 1

Views: 130

Answers (2)

Ben Bolker

Reputation: 226322

Another possibility:

df2 <- (mtcars
  %>% group_by(gear,carb) 
  %>% filter(cyl==max(cyl)) 
  %>% select(cyl,mpg)
)

(or select(gear,carb,cyl,mpg) in the last line if you want to avoid a message about "Adding missing grouping variables")

This approach would be convenient if you wanted to capture several more variables and didn't want to keep typing which.max().
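
One difference between the two approaches is worth seeing side by side: filter() keeps every row tied for the group maximum, while which.max() returns only the first matching index. The following is a minimal sketch of that difference, not a definitive implementation. The object names (first_only, all_ties, one_row) are made up for illustration, and the slice_max() variant and the .groups argument assume dplyr 1.0.0 or later.

```r
library(dplyr)

# One row per group: which.max() returns only the first index of the maximum.
first_only <- mtcars %>%
  group_by(gear, carb) %>%
  summarise(max_cyl = max(cyl),
            mpg = mpg[which.max(cyl)],
            .groups = "drop")

# Possibly several rows per group: filter() keeps every car whose cyl
# equals the group maximum, so ties are all retained.
all_ties <- mtcars %>%
  group_by(gear, carb) %>%
  filter(cyl == max(cyl)) %>%
  ungroup() %>%
  select(gear, carb, cyl, mpg)

# One row per group while keeping whole rows (assumes dplyr >= 1.0.0).
one_row <- mtcars %>%
  group_by(gear, carb) %>%
  slice_max(cyl, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(gear, carb, cyl, mpg)

# Compare row counts to see whether ties were kept.
nrow(first_only)
nrow(all_ties)
nrow(one_row)
```

If exactly one car per group is needed, the which.max() or with_ties = FALSE forms fit; if every tied car should be reported, the filter() form does that.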

Upvotes: 3

PhiloT

Reputation: 181

I discovered the which.max() function works for this.

df1 <- mtcars %>%
  group_by(gear,carb) %>%
  summarise(max_cyl = max(cyl),
            mpg = mpg[which.max(cyl)])

Upvotes: 3
