Select rows equal to or less than maximum value (i.e. delete trailing maximums) with group_by in R

Question

I am trying to delete all observations (rows) after a maximum value is reached when using group_by. In other words, I want to select only the observations equal to or less than a maximum in each group. I have a dataset similar to the below in which I want to group_by(Group1, Group2)

UID<-1:30
Group1<-c(rep("jedi", 10), rep("sith", 10), rep("powerless", 10))
Group2<-c(rep("blue", 5), rep("green", 5), rep("purple", 5), rep("red", 5), rep("yellow", 5), rep("orange", 5))
Value<- c(1,2,3,3,3,0,0,1,1,1,1,1,1,2,3,0,2,4,4,4,2,3,3,5,5,0,3,5,5,5)
df<-data.frame(UID, Group1, Group2, Value)

My expected output would look like this;

UID.Expected<-c(1,2,3,6,7,8,11,12,13,14,15,16,17,18,21,22,23,25,26,27,28)
df.expected <- df %>% filter(UID %in% UID.Expected)

I have tried top_n, slice_min/max, select, and filter but can't figure it out. I know it something super simple but can't quite get there. Assistance is greatly appreciated.

akrun · Accepted Answer

We can use match to get the index of the first 'max' 'Value' per group, get the sequence and use that in slice

library(dplyr)
df %>% 
   group_by(Group1, Group2) %>% 
   slice(seq(match(max(Value), Value))) %>%
   ungroup %>%
   arrange(UID) %>%
   as.data.frame

-output

#   UID    Group1 Group2 Value
#1    1      jedi   blue     1
#2    2      jedi   blue     2
#3    3      jedi   blue     3
#4    6      jedi  green     0
#5    7      jedi  green     0
#6    8      jedi  green     1
#7   11      sith purple     1
#8   12      sith purple     1
#9   13      sith purple     1
#10  14      sith purple     2
#11  15      sith purple     3
#12  16      sith    red     0
#13  17      sith    red     2
#14  18      sith    red     4
#15  21 powerless yellow     2
#16  22 powerless yellow     3
#17  23 powerless yellow     3
#18  24 powerless yellow     5
#19  26 powerless orange     0
#20  27 powerless orange     3
#21  28 powerless orange     5

Select rows equal to or less than maximum value (i.e. delete trailing maximums) with group_by in R

Answers (1)

Related Questions