san1

Reputation: 515

Assign max n value based on group

I am trying to compute out_value from val2 within each group of val1: copy val2 row by row until the value 2 (the maximum) is first reached, then set every later row in that group to 0.
Ex: Group 3 from val1
in row 1, val2 is 0, so out_value is 0
in row 2, val2 is 1, so out_value is 1
in row 3, val2 is 1, so out_value is 1
in row 4, val2 is 2, so out_value is 2 (here 2 is reached)
in rows 5 to 6, out_value is 0

Group 6 from val1
in row 7, val2 is 1, so out_value is 1
in row 8, val2 is 2, so out_value is 2 (here 2 is reached)
in rows 9 to 10, out_value is 0

val1 = c(3,3,3,3,3,3,6,6,6,6)
val2 = c(0,1,1,2,1,2,1,2,1, 2)

df = data.frame(val1,val2)

out_value = c(0,1,1,2,0,0,1,2,0,0)
df_out = data.frame(val1,val2,out_value)

Upvotes: 3

Views: 71

Answers (4)

TarJae

Reputation: 78927

Here is an alternative solution using ifelse, cumsum, lag and max:

  1. Group by val1.
  2. Flag the rows that come after the first occurrence of the group maximum: lag val2 == max(val2) by one row, then take its cumsum.
  3. Inside ifelse, keep val2 wherever that cumulative sum is still 0 and set out_val to 0 everywhere else.
library(dplyr)
df %>% 
  group_by(val1) %>% 
  mutate(out_val = ifelse(
    cumsum(lag(val2 == max(val2), default = FALSE)) == 0, val2, 0))
   val1  val2 out_val
   <dbl> <dbl>   <dbl>
 1     3     0       0
 2     3     1       1
 3     3     1       1
 4     3     2       2
 5     3     1       0
 6     3     2       0
 7     6     1       1
 8     6     2       2
 9     6     1       0
10     6     2       0
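
The three steps above can be sketched with each intermediate vector kept as its own column, which makes it easier to see why rows after the first maximum become 0. The column names is_max, prev_max and n_before are made up for this illustration and are not part of the original answer.

```r
library(dplyr)

df <- data.frame(
  val1 = c(3, 3, 3, 3, 3, 3, 6, 6, 6, 6),
  val2 = c(0, 1, 1, 2, 1, 2, 1, 2, 1, 2)
)

steps <- df %>%
  group_by(val1) %>%
  mutate(
    is_max   = val2 == max(val2),              # TRUE on rows holding the group maximum
    prev_max = lag(is_max, default = FALSE),   # same flag shifted down by one row
    n_before = cumsum(prev_max),               # number of maximum rows strictly before this row
    out_val  = ifelse(n_before == 0, val2, 0)  # keep val2 up to and including the first maximum
  ) %>%
  ungroup()

print(steps)
```

Because the flag is lagged before cumsum is applied, the row that holds the first maximum still has n_before equal to 0 and keeps its value.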

Upvotes: 2

Maël

Reputation: 51974

Use dplyr::cumany if you want values to be zero after the first 2, or base R's cummax if you want the running maximum within each group:

library(dplyr)

df %>% 
  group_by(val1) %>% 
  mutate(out_val = ifelse(lag(cumany(val2 == 2), default = FALSE), 0, val2))

# A tibble: 10 x 3
# Groups:   val1 [2]
    val1  val2 out_val
   <dbl> <dbl>   <dbl>
 1     3     0       0
 2     3     1       1
 3     3     1       1
 4     3     2       2
 5     3     1       0
 6     3     2       0
 7     6     1       1
 8     6     2       2
 9     6     1       0
10     6     2       0
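
The answer mentions cummax but only shows the cumany version. As a rough sketch of one reading of that alternative (an assumption, since the answer gives no code for it), the running maximum per group can be computed in base R with ave. The name running_max is chosen here for illustration.

```r
df <- data.frame(
  val1 = c(3, 3, 3, 3, 3, 3, 6, 6, 6, 6),
  val2 = c(0, 1, 1, 2, 1, 2, 1, 2, 1, 2)
)

# cummax() is applied separately to the val2 values of each val1 group
running_max <- ave(df$val2, df$val1, FUN = cummax)

print(cbind(df, running_max))
```

With this variant the value stays at 2 once it has been reached instead of dropping to 0, so it does not reproduce the out_value requested in the question.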

Upvotes: 2

r2evans

Reputation: 160437

I'm interpreting your logic as:

  • cumulative max until the first "2" is encountered;
  • "0" after the first "2"
  • all this "by group" of val1

base R

df$out_value <- ave(df$val2, df$val1, FUN = function(z) {
  out <- cummax(z)
  ifelse(c(FALSE, out[-length(z)] == 2), 0, out)
})
df
#    val1 val2 out_value
# 1     3    0         0
# 2     3    1         1
# 3     3    1         1
# 4     3    2         2
# 5     3    1         0
# 6     3    2         0
# 7     6    1         1
# 8     6    2         2
# 9     6    1         0
# 10    6    2         0
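
If the value to stop at is not always 2, the same idea can be wrapped in a small helper. This is only a sketch: the function name cap_after_target and its target argument are hypothetical, and it compares with >= rather than ==, which is a deliberate change so that a jump past the target also triggers the zeros.

```r
df <- data.frame(
  val1 = c(3, 3, 3, 3, 3, 3, 6, 6, 6, 6),
  val2 = c(0, 1, 1, 2, 1, 2, 1, 2, 1, 2)
)

# Hypothetical helper: running max until `target` is reached, 0 afterwards
cap_after_target <- function(z, target = 2) {
  running <- cummax(z)
  # TRUE for rows that come after a row where the running max hit the target
  reached_before <- c(FALSE, head(running, -1) >= target)
  ifelse(reached_before, 0, running)
}

out <- ave(df$val2, df$val1, FUN = cap_after_target)
print(out)
```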

Upvotes: 1

AndS.

Reputation: 8110

I'm not sure if this answer will be robust given your real data, but you could just populate 0 after the first instance of 2 by group:

library(tidyverse)

df |>
  group_by(val1) |>
  mutate(out_value = ifelse(row_number() > which(val2 == 2)[[1]], 0, val2))
#> # A tibble: 10 x 3
#> # Groups:   val1 [2]
#>     val1  val2 out_value
#>    <dbl> <dbl>     <dbl>
#>  1     3     0         0
#>  2     3     1         1
#>  3     3     1         1
#>  4     3     2         2
#>  5     3     1         0
#>  6     3     2         0
#>  7     6     1         1
#>  8     6     2         2
#>  9     6     1         0
#> 10     6     2         0
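
One case where the code above is not robust is a group that never contains a 2: which(val2 == 2)[[1]] then fails with a subscript error. A possible workaround, sketched here on made-up data (df2, with a group 9 that has no 2), is to use match with a nomatch position past the end of the group. The column name first_two is introduced only for this illustration.

```r
library(dplyr)

# Made-up data: group 9 never reaches 2
df2 <- data.frame(
  val1 = c(3, 3, 3, 3, 9, 9, 9),
  val2 = c(0, 2, 1, 2, 0, 1, 1)
)

res <- df2 |>
  group_by(val1) |>
  mutate(
    # position of the first 2 in the group, or one past the last row if there is none
    first_two = match(2, val2, nomatch = n() + 1L),
    out_value = ifelse(row_number() > first_two, 0, val2)
  ) |>
  ungroup()

print(res)
```

Groups without a 2 keep val2 unchanged, which seems the natural reading of the question but is an assumption about the intended behaviour.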

Upvotes: 1
