Reputation: 68426
I have a zoo object which consists of a timestamped (to the second) timeseries. The timeseries is irregular in that the time intervals between the values are not regularly spaced.
I would like to transform the irregularly spaced timeseries object into a regularly spaced one, where the time interval between values is constant (say 15 minutes) and the timestamps fall on "real world" clock times.
Some sample data may help illustrate further:
# Sample data
2011-05-05 09:30:04 101.32
2011-05-05 09:30:14 100.09
2011-05-05 09:30:19 99.89
2011-05-05 09:30:35 89.66
2011-05-05 09:30:45 95.16
2011-05-05 09:31:12 100.28
2011-05-05 09:31:50 100.28
2011-05-05 09:32:10 98.28
I'd like to aggregate them (using my custom function) for every specified time period (e.g. 30 second time bucket) such that the output looks like the table presented below.
The key is that I want to aggregate every 30 seconds by clock time, NOT every 30 seconds counted from my first observation time. Naturally, the first time bucket would be the first bucket for which I have a recorded observation (i.e. row) in the data to be aggregated.
2011-05-05 09:30:00 101.32
2011-05-05 09:30:30 89.66
2011-05-05 09:31:00 100.28
In the example given, my custom aggregate function simply returns the first value in the 'set' of 'selected rows' to aggregate over.
Upvotes: 3
Views: 1260
Reputation: 269596
Read in the data and then aggregate it by minute (using mean here):
Lines <- "2011-05-05 09:30:04 101.32
2011-05-05 09:30:14 100.09
2011-05-05 09:30:19 99.89
2011-05-05 09:30:35 89.66
2011-05-05 09:30:45 95.16
2011-05-05 09:31:12 100.28
2011-05-05 09:31:50 100.28
2011-05-05 09:32:10 98.28"
library(zoo)
library(chron)
toChron <- function(d, t) as.chron(paste(d, t))
z <- read.zoo(text = Lines, index = 1:2, FUN = toChron)
aggregate(z, trunc(time(z), "00:01:00"), mean)
The result is:
(05/05/11 09:30:00) (05/05/11 09:31:00) (05/05/11 09:32:00)
97.224 100.280 98.280
Upvotes: 5
Reputation: 263342
I hope we can assume this is in a zoo or xts object. If so then try this:
# First get a start for a set of intervals, need to use your tz
beg<- as.POSIXct( format(index(dat[1,]), "%Y-%m-%d %H:%M", tz="EST5EDT"))
# Then create a sequence of 30 second intervals
tseq <- beg+seq(0,4*30, by=30)
# Then this will create a vector that you can use for your aggregation fun
findInterval(index(dat), tseq)
#[1] 1 1 1 2 2 3 4 5
# To find the first row in a subset of rows from tapply, try "[" with 1
tapply(dat, findInterval(index(dat), tseq), "[", 1)
# 1 2 3 4 5
#101.32 89.66 100.28 100.28 98.28
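The sequence above is hard-coded to five break points (`4*30`). As a minimal sketch of the same `findInterval` idea in base R only, the break points can instead be derived from the data. The names `times`, `values`, `idx` and `first` are made up for this illustration, the timestamps are parsed as UTC purely to keep the output reproducible, and truncating the start to the minute assumes the bucket width divides 60 seconds evenly.

```r
# Sample data from the question (UTC is an assumption; use your own tz)
times <- as.POSIXct(c("2011-05-05 09:30:04", "2011-05-05 09:30:14",
                      "2011-05-05 09:30:19", "2011-05-05 09:30:35",
                      "2011-05-05 09:30:45", "2011-05-05 09:31:12",
                      "2011-05-05 09:31:50", "2011-05-05 09:32:10"),
                    tz = "UTC")
values <- c(101.32, 100.09, 99.89, 89.66, 95.16, 100.28, 100.28, 98.28)

# Start of the minute containing the first observation
beg <- as.POSIXct(trunc(times[1], "mins"))

# 30-second break points covering the whole observed range
tseq <- seq(beg, max(times), by = 30)

# Bucket number for every observation
idx <- findInterval(times, tseq)

# First value in each non-empty bucket, labelled by the bucket start time
first <- tapply(values, idx, function(v) v[1])
names(first) <- format(tseq[as.integer(names(first))], "%H:%M:%S", tz = "UTC")
print(first)
```

Buckets with no observations simply do not appear in the result, because `tapply` only sees the bucket numbers that actually occur in `idx`.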
Upvotes: 2
Reputation: 16277
You should look at align.time in xts. It does something very close to what you want to achieve.
my.data <- read.table(text="date,x
2011-05-05 09:30:04,101.32
2011-05-05 09:30:14,100.09
2011-05-05 09:30:19,99.89
2011-05-05 09:30:35,89.66
2011-05-05 09:30:45,95.16
2011-05-05 09:31:12,100.28
2011-05-05 09:31:50,100.28
2011-05-05 09:32:10,98.28", header=TRUE, as.is=TRUE,sep = ",")
library(xts)  # must be loaded before xts() is called
my.data <- xts(my.data[,2],as.POSIXlt(my.data[,1],format="%Y-%m-%d %H:%M:%S"))
res <-align.time(my.data,30)
res[!duplicated(index(res)),]
[,1]
2011-05-05 09:30:30 101.32
2011-05-05 09:31:00 89.66
2011-05-05 09:31:30 100.28
2011-05-05 09:32:00 100.28
2011-05-05 09:32:30 98.28
You can lag the time series by 30 seconds if it makes the interpretation clearer.
Upvotes: 0
Reputation: 13932
I would simply truncate the times towards your interval, so assuming t is the time (use as.POSIXct if it's not)
bucket = t - as.numeric(t) %% 30
then you can aggregate over bucket, like aggregate(value, list(bucket), sum) (I don't use zoo so this is with pure R)
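For illustration, here is a small end-to-end sketch of this truncation approach on the sample data from the question, using only base R. The aggregation function is swapped from `sum` to "first value in the bucket" to match what the question asks for. The vectors `t` and `value` are built by hand here, and parsing as UTC is an assumption made only so the result is reproducible.

```r
# Sample data from the question (UTC is an assumption; use your own tz)
t <- as.POSIXct(c("2011-05-05 09:30:04", "2011-05-05 09:30:14",
                  "2011-05-05 09:30:19", "2011-05-05 09:30:35",
                  "2011-05-05 09:30:45", "2011-05-05 09:31:12",
                  "2011-05-05 09:31:50", "2011-05-05 09:32:10"),
                tz = "UTC")
value <- c(101.32, 100.09, 99.89, 89.66, 95.16, 100.28, 100.28, 98.28)

# Floor every timestamp to the start of its 30-second clock bucket
bucket = t - as.numeric(t) %% 30

# Keep the first observation in each bucket
res <- aggregate(value, list(bucket = bucket), function(v) v[1])
print(res)
```

Because the buckets are computed from the epoch seconds modulo 30, they line up with clock times (:00 and :30) rather than with the first observation.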
Upvotes: 1