Reputation: 331
Say I have a data frame like this:
group1 <- c('a','a','a','a','a','a','b','b','b','b','b','b','b','b')
group2 <- c('x','y','x','y','x','y','x','y','x','y','x','y','x','y')
value <- round(runif(14, min=0, max=1), digits = 2)
df1 <- as.data.frame(cbind(group1,group2,value))
df1$value <- as.numeric(df1$value)
It is easy to get a new data frame with only the maximum value of each group by using the dplyr package and its summarise function:
df2 <- summarise(group_by(df1,group1),max_v = max(value))
But what I want is a new data frame with the three largest values of each group, something like this (where secondmax and thirdmax are hypothetical functions):
df2 <- summarise(group_by(df1,group1),max_v = max(value),max2_v = secondmax(value),max3_v = thirdmax(value))
Is there a way to do that without using the sort function?
Upvotes: 5
Views: 1519
Reputation: 887291
We can use an arrange/slice/spread approach to get this:
library(dplyr)
library(tidyr)
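df1 %>%
  group_by(group1) %>%
  arrange(desc(value)) %>%                        # largest values first
  slice(seq_len(3)) %>%                           # keep the top 3 rows of each group
  mutate(Max = paste0("max_", row_number())) %>%  # label rows max_1, max_2, max_3
  select(-group2) %>%                             # drop the column that would block the reshape
  spread(Max, value)                              # one column per label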
# A tibble: 2 x 4
# Groups: group1 [2]
# group1 max_1 max_2 max_3
#* <fctr> <dbl> <dbl> <dbl>
#1 a 0.84 0.69 0.41
#2 b 0.89 0.72 0.54
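For readers on recent package versions, the same idea can probably be written with slice_max() and pivot_wider(). This is only a sketch, assuming dplyr >= 1.0.0 and tidyr >= 1.0.0; the toy data frame and its values are made up here so that the result is reproducible, and they are not the random data from the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical fixed data (not the question's runif() values)
toy <- data.frame(
  group1 = c("a", "a", "a", "a", "b", "b", "b", "b"),
  value  = c(0.10, 0.84, 0.41, 0.69, 0.54, 0.89, 0.20, 0.72)
)

top3 <- toy %>%
  group_by(group1) %>%
  slice_max(value, n = 3, with_ties = FALSE) %>%  # three largest rows per group
  mutate(Max = paste0("max_", row_number())) %>%  # rank label within each group
  ungroup() %>%
  pivot_wider(names_from = Max, values_from = value)

top3
```

With with_ties = FALSE each group contributes at most three rows, so the reshaped result has at most three max_ columns; a group with fewer than three values would get NA in the missing ones.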
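Here df1 is built with data.frame() directly, which keeps value numeric (cbind() would coerce every column to character):
df1 <- data.frame(group1,group2,value)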
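If the summarise() shape from the question is preferred, one possible sketch uses dplyr's nth() with its order_by argument in place of the hypothetical secondmax and thirdmax. There is no explicit call to sort(), although nth() still orders the values internally. The toy data below is again made up for reproducibility.

```r
library(dplyr)

# Hypothetical fixed data (not the question's runif() values)
toy <- data.frame(
  group1 = c("a", "a", "a", "a", "b", "b", "b", "b"),
  value  = c(0.10, 0.84, 0.41, 0.69, 0.54, 0.89, 0.20, 0.72)
)

top3_summ <- toy %>%
  group_by(group1) %>%
  summarise(
    max_v  = max(value),
    max2_v = nth(value, 2, order_by = desc(value)),  # second largest
    max3_v = nth(value, 3, order_by = desc(value))   # third largest
  )

top3_summ
```

This keeps one row per group with fixed column names, at the cost of one nth() call per rank, so it is mainly convenient when only a few top values are needed.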
Upvotes: 3