Reputation: 85
I would like to count the number of times that value
appears >20
from a specific date defined by the number 1 of column (ncol = 1
) for the previous 5
days. For example, the date 1997.05.25
how many times the value appears >20
to 1997.05.21
.
Here are the variables:
value <- c(37,19.75,19.5,14.5,24.75,25,25.5,19.75,19.75,14.25,21.25,21.75,17.5,16.25,14.5,
14.5,14.75,9.5,11.75,15.25,14.25,16.5,13.5,18.25,13.5,11.25,10.75,12,8.5,
9.75,14.75)
Date <- c("1997-05-01","1997-05-02","1997-05-03","1997-05-04","1997-05-05",
"1997-05-06","1997-05-07","1997-05-08","1997-05-09","1997-05-10",
"1997-05-11","1997-05-12","1997-05-13","1997-05-14","1997-05-15",
"1997-05-16","1997-05-17","1997-05-18","1997-05-19","1997-05-20",
"1997-05-21","1997-05-22","1997-05-23","1997-05-24","1997-05-25",
"1997-05-26","1997-05-27","1997-05-28","1997-05-29","1997-05-30",
"1997-05-31")
ncol <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)`
data <- data.frame(value, Date, ncol)
library(data.table)
setDT(data)[, ncol := ifelse(Date %in% c("1997-05-05","1997-05-11","1997-05-14",
"1997-05-18","1997-05-25"), ncol+1, 0) ]
how to do that best? Help would be much appreciated.
Here is my expected result:
1. value Date ncol newcol
2. 37.00 1997-05-05 1 2
3. 19.75 1997-05-11 1 2
4. 19.50 1997-05-14 1 2
5. 14.50 1997-05-18 1 0
6. 24.75 1997-05-25 1 0
Upvotes: 2
Views: 415
Reputation: 70266
If I understand correctly you can use Reduce
and shift
:
library(data.table)
DT <- data.table(value, Date)
DT[, ncol := Reduce(`+`, shift(value > 20, 1:5, fill = 0))]
What this does: it counts, for each row of DT, how many time the value
-column in the previous 5 rows was greater than 20 and writes that into column ncol
.
If you wanted to change this 5-row "window" from the previous 5 days to include the current day + the previous 4 rows, you would change the code to:
DT[, ncol := Reduce(`+`, shift(value > 20, 0:4, fill = 0))]
Upvotes: 1
Reputation: 604
Here is a solution using dplyr
:
library(dplyr)
data$Date <- as.Date(data$Date)
data %>%
filter(between(Date, "1997-05-01", "1997-05-21"),
value > 20) %>%
nrow()
Upvotes: 0