Rincewind

Reputation: 131

Vectorized solution for using previous value in a column under some condition

This is probably easy, but I haven't been able to find a vectorized solution for it, only a clunky for loop.

library(tibble)
df <- tibble(a = c(1, 2, 3, 4, 3, 2, 5, 6, 9), b = c(1, 2, 3, 4, 4, 4, 5, 6, 9))

Column a should never decrease and should end up looking like column b. So, whenever a value in a is smaller than the value before it, the previous (already corrected) value should be used instead.

Thanks!

Upvotes: 0

Views: 51

Answers (2)

s_baldur

Reputation: 33488

Using cummax() from base R:

df[["b1"]] <- cummax(df[["a"]])

> df
  a b b1
1 1 1  1
2 2 2  2
3 3 3  3
4 4 4  4
5 3 4  4
6 2 4  4
7 5 5  5
8 6 6  6
9 9 9  9

Using dplyr syntax:

library(dplyr)

df %>%
  mutate(b1 = cummax(a))
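For comparison, here is a minimal sketch of the kind of explicit loop that cummax() replaces. The helper name running_max_loop is hypothetical (it is not from the question), and the sketch assumes the vector contains no NA values, since cummax() propagates NA while this loop would stop with an error.

```r
# Hypothetical helper: carry the previous value forward whenever the
# current value is smaller. Assumes x contains no NA values.
running_max_loop <- function(x) {
  out <- x
  for (i in seq_along(x)[-1]) {
    if (out[i] < out[i - 1]) out[i] <- out[i - 1]
  }
  out
}

a <- c(1, 2, 3, 4, 3, 2, 5, 6, 9)
loop_result <- running_max_loop(a)
vec_result <- cummax(a)

# Both approaches should give the same running maximum.
identical(loop_result, vec_result)
```

Because each corrected value is compared against the already corrected previous one, the loop computes a running maximum, which is what cummax() does in a single vectorized call.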

Upvotes: 1

Ronak Shah

Reputation: 388962

We can use lag() from dplyr and fill() from tidyr, both of which are loaded with the tidyverse:

library(tidyverse)

df %>%
  mutate(b1 = replace(a, a < lag(a), NA)) %>%
  fill(b1)


#      a     b    b1
#  <dbl> <dbl> <dbl>
#1     1     1     1
#2     2     2     2
#3     3     3     3
#4     4     4     4
#5     3     4     4
#6     2     4     4
#7     5     5     5
#8     6     6     6
#9     9     9     9

The logic: we replace each value in a that is smaller than the value before it with NA, and then use fill to replace those NAs with the last non-NA value.
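One caveat worth checking against your own data: the comparison above looks only at the immediately preceding value of a, so a dip that recovers only partway is not fully corrected. The sketch below uses a made-up input (df2 and the column names via_fill and via_cummax are hypothetical, not from the question) to compare this approach with cummax().

```r
library(dplyr)
library(tidyr)

# Hypothetical input: after the peak of 5, the values 2 and 3 both stay
# below it, but 3 is not smaller than its direct predecessor 2.
df2 <- tibble(a = c(1, 5, 2, 3, 6))

res <- df2 %>%
  mutate(
    via_fill = replace(a, a < lag(a), NA),  # only the 2 becomes NA
    via_cummax = cummax(a)                  # running maximum
  ) %>%
  fill(via_fill)

res
# via_fill:   1 5 5 3 6  (the 3 survives, so the column still decreases)
# via_cummax: 1 5 5 5 6
```

For the data in the question both approaches agree, because every dip is followed by a value above the earlier peak. If the column can stay below a previous peak for several rows without decreasing further, cummax() appears to be the safer choice.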

Upvotes: 2
