Reputation: 21
I have a data frame called df with 20 rows and two variables, test_value and day. I would like to create a new variable called test_x_max that captures the maximum test_value over the previous x records. For example, if x is 5 and we are looking at row 15, it should pick the maximum test_value between day 10 and day 15. How can I achieve this? Thanks in advance. Pavan
Upvotes: 1
Views: 66
Reputation: 3923
The newish slider package seems appropriate if you like tidyverse style.
library(dplyr)
library(slider)
set.seed(2020)
pretend_df <- tibble(
day = 1:20,
testvalue = sample(100, 20)
)
# require a full window: the current row plus the 5 rows before it (NA otherwise)
slide_dbl(pretend_df, ~ max(.x$testvalue), .before = 5, .complete = TRUE)
#> [1] NA NA NA NA NA 88 88 88 88 70 70 72 93 93 93 93 93 93 80 82
# accept shorter windows at the start, where fewer than 5 previous rows exist
slide_dbl(pretend_df, ~ max(.x$testvalue), .before = 5, .complete = FALSE)
#> [1] 28 87 87 88 88 88 88 88 88 70 70 72 93 93 93 93 93 93 80 82
pretend_df$maxlast5 <- slide_dbl(pretend_df, ~ max(.x$testvalue), .before = 5, .complete = TRUE)
pretend_df
# A tibble: 20 x 3
day testvalue maxlast5
<int> <int> <dbl>
1 1 28 NA
2 2 87 NA
3 3 22 NA
4 4 88 NA
5 5 65 NA
6 6 17 88
7 7 36 88
8 8 42 88
9 9 70 88
10 10 49 70
11 11 56 70
12 12 72 72
13 13 93 93
14 14 80 93
15 15 29 93
16 16 3 93
17 17 66 93
18 18 4 93
19 19 78 80
20 20 82 82
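Note that .before = 5 means each window holds the current row plus the five rows before it, so six values in total. For readers without slider, here is a minimal base R sketch of the same trailing window, assuming that reading of the arguments. The names trailing_max, before and complete are hypothetical, chosen to mirror the slider call, and are not part of any package.

```r
# Hypothetical helper: maximum over the current value and the `before`
# values preceding it. With complete = TRUE, positions that lack a full
# window get NA; with complete = FALSE, the shorter window is used.
trailing_max <- function(values, before, complete = TRUE) {
  vapply(seq_along(values), function(i) {
    if (complete && i <= before) {
      return(NA_real_)
    }
    start <- max(1L, i - before)
    as.numeric(max(values[start:i]))
  }, numeric(1))
}

# testvalue column copied from the tibble printed above
testvalue <- c(28, 87, 22, 88, 65, 17, 36, 42, 70, 49,
               56, 72, 93, 80, 29, 3, 66, 4, 78, 82)

maxlast5 <- trailing_max(testvalue, before = 5, complete = TRUE)
maxlast5_partial <- trailing_max(testvalue, before = 5, complete = FALSE)
```

If the window should hold exactly five values including the current row, before = 4 would be the setting to try.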
Upvotes: 0
Reputation: 174506
You can use zoo::rollmax combined with cummax:
library(zoo)
df$test_x_max <- c(cummax(df$test_value[1:4]), rollmax(df$test_value, 5, align = "right"))
For example:
set.seed(100)
df <- data.frame(day = 1:20, test_value = sample(20))
df$test_x_max <- c(cummax(df$test_value[1:4]), rollmax(df$test_value, 5, align = "right"))
df
#> day test_value test_x_max
#> 1 1 10 10
#> 2 2 6 10
#> 3 3 16 16
#> 4 4 14 16
#> 5 5 12 16
#> 6 6 7 16
#> 7 7 19 19
#> 8 8 17 19
#> 9 9 4 19
#> 10 10 15 19
#> 11 11 13 19
#> 12 12 2 17
#> 13 13 11 15
#> 14 14 8 15
#> 15 15 3 13
#> 16 16 9 11
#> 17 17 1 11
#> 18 18 20 20
#> 19 19 18 20
#> 20 20 5 20
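As a possible alternative to stitching cummax and rollmax together, zoo's rollapplyr accepts partial = TRUE, which applies the function to whatever part of the window exists at the start of the series. A minimal sketch, using the test_value column from the output above instead of sample() so that it does not depend on the random number generator:

```r
library(zoo)

# test_value column copied from the data frame printed above
df <- data.frame(
  day = 1:20,
  test_value = c(10, 6, 16, 14, 12, 7, 19, 17, 4, 15,
                 13, 2, 11, 8, 3, 9, 1, 20, 18, 5)
)

# Right-aligned window of 5 rows (the current row plus the 4 before it);
# partial = TRUE lets the first 4 rows use the shorter windows available
df$test_x_max <- rollapplyr(df$test_value, 5, max, partial = TRUE)
```

Changing the width from 5 to 6 would cover day 10 to day 15 inclusive for row 15, if that is the window the question intends.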
Upvotes: 4