Reputation: 31
This is my repex:
dates <- seq(as.POSIXct("2015-01-01 13:10:00", tz = "UTC"), as.POSIXct("2015-01-01 13:10:10", tz="UTC"), by="1 sec")
dates[dst(dates)] <- dates[dst(dates)] - 3600
datavalues <- data.frame(x=c(90,90,80,65,NA,64,71,75,62,63,74))
data <- cbind(dates,datavalues)
data
dates x
1 2015-01-01 13:10:00 90
2 2015-01-01 13:10:01 90
3 2015-01-01 13:10:02 80
4 2015-01-01 13:10:03 65
5 2015-01-01 13:10:04 NA
6 2015-01-01 13:10:05 64
7 2015-01-01 13:10:06 71
8 2015-01-01 13:10:07 75
9 2015-01-01 13:10:08 62
10 2015-01-01 13:10:09 63
11 2015-01-01 13:10:10 74
I would have to obtain the following data frame (which I will concatenate to data):
results <- data.frame(Duration=c(3,3,3,0,0,0,2,2,0,0,1),Maxx=c(90,90,90,0,0,0,75,75,0,0,74),Delta=c(0,0,0,0,0,0,7,0,0,0,11))
results
Duration Maxx Delta
1 3 90 0
2 3 90 0
3 3 90 0
4 0 0 0
5 0 0 0
6 0 0 0
7 2 75 7
8 2 75 0
9 0 0 0
10 0 0 0
11 1 74 11
I fix a threshold to 70.
I would like if possible to get code using dplyr because around this pice of code, there is already dplyr code. Thank you in advance.
Upvotes: 1
Views: 67
Reputation: 389175
With the help of data.table
rleid
you can create group of consecutive values which are above or below the threshold and calculate the numbers in each group.
library(dplyr)
library(data.table)
threshold <- 70
data %>%
#Create a unique group of consecutive values
group_by(group = rleid(replace(x, is.na(x), 0) < threshold)) %>%
#If the value is less than threshold put 0 in duration or else
#include number of observations in the group. Do the same for max value.
mutate(Duration = if_else(x < threshold, 0L, n(), missing = 0L),
#+(Duration > 0) is used to turn values less than threshold to 0
Maxx = max(x, na.rm = TRUE) * +(Duration > 0)) %>%
ungroup() %>%
#Subtract current value with previous value
mutate(Delta = x - lag(x),
#Keep only those values that are first row in each group
Delta = replace(Delta, group == lag(group, default = first(group)) |
Duration == 0, 0)) %>%
select(-group)
# dates x Duration Maxx Delta
# <dttm> <dbl> <int> <dbl> <dbl>
# 1 2015-01-01 13:10:00 90 3 90 0
# 2 2015-01-01 13:10:01 90 3 90 0
# 3 2015-01-01 13:10:02 80 3 90 0
# 4 2015-01-01 13:10:03 65 0 0 0
# 5 2015-01-01 13:10:04 NA 0 0 0
# 6 2015-01-01 13:10:05 64 0 0 0
# 7 2015-01-01 13:10:06 71 2 75 7
# 8 2015-01-01 13:10:07 75 2 75 0
# 9 2015-01-01 13:10:08 62 0 0 0
#10 2015-01-01 13:10:09 63 0 0 0
#11 2015-01-01 13:10:10 74 1 74 11
Upvotes: 1