Reputation: 303
I want to calculate a moving sum of one column (populated with ones and zeroes), but only when the value in a corresponding column (time) is within a (moving) range of values.
My data looks like this:
values <- c(1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0)
seconds <- c(0.0, 1.0, 2.5, 3.0, 5.5, 6.0, 6.5, 7.0, 8.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.5, 16.0, 17.0, 18.0, 19.0, 20.0)
data <- data.frame(values, seconds)
Say I want to sum every 5 seconds worth of data in the 'values' column. Then my first 5-second sum (seconds >=0 & seconds <= 5) would be:
1 (because it corresponds to a 'seconds', 0.0, within the interval of interest)
+
0 (corresponds to 1.0 in 'seconds')
+
0 (2.5)
+
0 (3.0)
= 1
STOPs here because the next value (1) corresponds to 5.5 seconds, outside of the interval.
The next 5-second interval (seconds >= 1 & seconds <= 6) would equal:
0 + 0 + 0 + 1 + 1 = 2
3rd interval:
(seconds >= 2.5 & seconds <= 7.5) = 0 + 0 + 1 + 1 + 0 + 1 = 3
and so on.
I'm an R noob, so this is the method I'm using to calculate it (and it is super slow, so I know there must be a better way):
for(i in 1:20){movsum[i] <- sum(subset(data, seconds >= (seconds[i] - 5.0) & seconds <= seconds[i])$values)}
Thanks for your help. Let me know if there's anything I should clarify.
Upvotes: 1
Views: 689
Reputation: 67818
You may try some functions from the zoo
package:
library(zoo)
# convert your data to a zoo time series
z <- read.zoo(data, index = "seconds")
# create an empty, regular time series,
# which contains the full time range, in steps of 0.5 sec
z0 <- zoo(, seq(from = start(z), to = end(z), by = 0.5))
# 'expand' the irregular, original data to a regular series, by merging it with z0
z2 <- merge(z, z0)
# apply the desired function (sum) to a rolling window of width 11
# (number of observations in each window)
# move the time frame in steps of 2 (by = 2) which correspond to 1 sec
# use partial = TRUE, to allow the window to pass outside the time range
rollapply(z2, width = 11, by = 2, FUN = sum, na.rm = TRUE,
align = "left", partial = TRUE)
# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
# 1 2 3 3 3 3 2 2 1 2 2 3 3 2 2 1 0 0 0 0 0
Upvotes: 3
Reputation: 92300
Here's a possible data.table::foverlaps
solution. The idea here is to create 5 seconds interval look up table and then lookup within data
which values fall in each interval.
Choose an interval
int <- 5 ## 5 seconds
The load the package, add additional (identical) column to data
in order to set boundaries, create a new data set which will have the desired boundaries per row, run foverlaps
, key data
in order to enable the binary join, find the corresponding values in data$values
and sum them per each interval, something like the following seem to work
library(data.table)
setkey(setDT(data)[, seconds2 := seconds], seconds, seconds2)
lookup <- data[, .(seconds, seconds2 = seconds + int)]
res <- foverlaps(lookup, data, which = TRUE)[, values := data$values[yid]]
res[, .(SumValues = sum(values)), by = .(SecInterval = xid)]
# SecInterval SumValues
# 1: 1 1
# 2: 2 2
# 3: 3 3
# 4: 4 3
# 5: 5 3
# 6: 6 2
# 7: 7 1
# 8: 8 2
# 9: 9 1
# 10: 10 2
# 11: 11 3
# 12: 12 3
# 13: 13 2
# 14: 14 2
# 15: 15 1
# 16: 16 0
# 17: 17 0
# 18: 18 0
# 19: 19 0
# 20: 20 0
Upvotes: 3