philipp.kn_98

Reputation: 149

R aggregate column until one condition is met

I have a data frame of this form:

ID   Var1   Var2
1     1      1
1     2      2
1     3      3
1     4      2
1     5      2
2     1      4
2     2      8
2     3      10
2     4      10
2     5      7

I would like to get, for each ID group, the maximum of Var1 among the rows that come before Var2 reaches its maximum. The result will be a new data frame containing only one row per ID, so the outcome should look like this:

ID   Var1
1     2
2     2

So the function should take the maximum of Var1, but only consider the rows before Var2 reaches its maximum. The rows containing the maximum itself should not be included, and neither should the rows after it. I tried building something with a while loop, but it didn't work out. I'd also be thankful if the solution doesn't use data.table.

Thanks in advance

Upvotes: 0

Views: 211

Answers (1)

user12728748

Reputation: 8506

Maybe you could do something like this:

DF <- structure(list(
  ID = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L), 
  Var1 = c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L), 
  Var2 = c(1L, 2L, 3L, 2L, 2L, 4L, 8L, 10L, 10L, 7L)), 
  class = "data.frame", row.names = c(NA, -10L))

library(dplyr)

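DF %>% group_by(ID) %>%
  # keep only the rows before the first maximum of Var2 in each group;
  # seq_len() selects no rows when that maximum is in the group's first row
  slice(seq_len(which.max(Var2) - 1)) %>%
  # of the remaining rows, keep the one with the largest Var1
  slice_max(Var1) %>%
  select(ID, Var1)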
#> # A tibble: 2 x 2
#> # Groups:   ID [2]
#>      ID  Var1
#>   <int> <int>
#> 1     1     2
#> 2     2     2

Created on 2020-08-04 by the reprex package (v0.3.0)
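
If you would rather avoid extra packages altogether, the same idea can be sketched in base R. This is only a minimal sketch under a few assumptions: `max_before_peak` is a hypothetical helper name, every group has at least one non-missing Var2 value, and a group whose Var2 maximum sits in its first row should get `NA` (there are no earlier rows to take a maximum over) instead of being dropped.

```r
# Example data copied from the question.
DF <- data.frame(
  ID   = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L),
  Var1 = c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L),
  Var2 = c(1L, 2L, 3L, 2L, 2L, 4L, 8L, 10L, 10L, 7L)
)

# Hypothetical helper: maximum of Var1 over the rows that come before
# the first occurrence of the maximum of Var2 within one group.
max_before_peak <- function(d) {
  n_before <- which.max(d$Var2) - 1L    # number of rows before the peak
  before <- d$Var1[seq_len(n_before)]   # empty if the peak is in row 1
  if (length(before) == 0L) NA_integer_ else max(before)
}

# Split by ID, apply the helper to each group, and build one row per ID.
groups <- split(DF, DF$ID)
result <- data.frame(
  ID   = as.integer(names(groups)),
  Var1 = unname(vapply(groups, max_before_peak, integer(1)))
)
result
#>   ID Var1
#> 1  1    2
#> 2  2    2
```

`which.max()` returns the position of the first maximum, so when Var2 is tied at its peak (as in ID 2, where 10 appears twice) only the rows before the first peak are considered.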

Upvotes: 1
