Reputation: 59
Can I perform autocorrelation / lag analysis on a zoo object in R with non-regular time steps? If so, how?
The only other post I could find here dealt with regular time series. I have a sequence of observations taken at irregular time steps. For example, (t,y) = (0,2668), (36.62,2723), (42,2723),...
where t is the time in hours, and y is the (categorical*) observation.
*edited from original post
I would like to look for lag correlations daily (lag = 24) and weekly (lag = 168) to see whether certain categories of observation repeat at / near these lag intervals. Is there a way to do this in R? I created a zoo object for my data but have been unable to find any documentation concerning how to do this.
Upvotes: 1
Views: 1762
Reputation: 56915
You can use aggregate to convert your data into daily and weekly intervals, and then calculate the autocorrelation with whatever function does it for regular time series (say acf), e.g.:
# make a data set to play with
library(zoo)
ts <- sort(runif(100)*168*3) # 100 observations over 3 weeks
ys <- runif(100) # y values
z <- zoo(ys, order.by=ts)
# ** convert to daily/weekly. ?aggregate.zoo
# NOTE: can use ts instead of index(z)
z.daily <- aggregate(z,index(z) %/% 24) # up to 21 elements (one per day that has data)
z.weekly <- aggregate(z,index(z) %/% 168) # up to 3 elements (one per week that has data)
# Now compute correlation, lag 1 (index in z.daily/weekly)
daily.acf <- acf(z.daily, lag.max=1)[1]
weekly.acf <- acf(z.weekly, lag.max=1)[1]
aggregate converts z to daily or weekly data by summing all observations that fall in each day or week (sum is the default FUN of aggregate.zoo). It does the grouping by looking at index(z) %/% 24 (or 168), which is the integer part of the hour of observation divided by 24 (i.e., the day on which it occurs).
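To see what the grouping does, here is a tiny sketch with made-up numbers (the names hours, values and z.toy are just for illustration, not from the question's data). It spells out FUN = sum so the summing is explicit:

```r
library(zoo)

# Hypothetical toy data: five observations at irregular times (in hours)
hours  <- c(0, 5.5, 23.9, 24, 50.25)
values <- c(1, 2, 3, 10, 100)
z.toy  <- zoo(values, order.by = hours)

# Integer division maps each time stamp to a 0-based day number
day <- index(z.toy) %/% 24
print(day)            # 0 0 0 1 2

# One element per day number, holding the sum of that day's values
z.toy.daily <- aggregate(z.toy, day, FUN = sum)
print(z.toy.daily)    # day 0: 6, day 1: 10, day 2: 100
```

If a sum is not what you want per day, passing a different FUN (mean, length, ...) should work the same way.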
Then the acf function calculates the autocorrelation (with the lag being on indices of the vector, not on time).
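As a rough check of the "indices, not time" point, this sketch compares acf's lag-1 value with the same quantity computed by hand from neighbouring elements. The series y is made up, and no time stamps are involved at all:

```r
# Hypothetical series; only the order of the values matters to acf()
set.seed(1)
y <- rnorm(50)

# acf() returns lags 0, 1, ..., so element 2 of $acf is the lag-1 value
r1 <- acf(y, lag.max = 1, plot = FALSE)$acf[2]

# The same number by hand: pair each element with its immediate successor
n <- length(y)
m <- mean(y)
r1.manual <- sum((y[-n] - m) * (y[-1] - m)) / sum((y - m)^2)

stopifnot(isTRUE(all.equal(r1, r1.manual)))
```

One consequence: if some day has no observations, it is simply absent from the aggregated series, so neighbouring elements can be more than one day apart and "lag 1" no longer means exactly one day.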
I don't really know much about statistics, and one thing I noticed was that if you do:
weekly.acf <- acf(z.daily,lag.max=7)[7]
you get a different answer from when you calculate autocorrelation from z.weekly, because it's doing autocorrelation on daily data with a lag of 7 as opposed to weekly data with a lag of 1, so I'm not sure if what I'm doing is actually what you want.
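For what it's worth, here is a sketch that puts the two numbers side by side on simulated data (six weeks of made-up observations; the names hours6, z6 and so on are only for this example). It assumes every day ends up with at least one observation, otherwise lag 7 on the daily series is not exactly one week:

```r
library(zoo)

# Hypothetical data: 300 random observations spread over six weeks
set.seed(42)
hours6 <- sort(runif(300) * 168 * 6)
z6     <- zoo(runif(300), order.by = hours6)

# Daily and weekly totals, grouped by integer division of the hour
z6.daily  <- aggregate(z6, index(z6) %/% 24,  FUN = sum)
z6.weekly <- aggregate(z6, index(z6) %/% 168, FUN = sum)

# Lag 7 on daily totals: compares each day with the day seven positions later
lag7.daily <- acf(coredata(z6.daily), lag.max = 7, plot = FALSE)$acf[8]

# Lag 1 on weekly totals: compares each week's total with the next week's
lag1.weekly <- acf(coredata(z6.weekly), lag.max = 1, plot = FALSE)$acf[2]

print(c(daily.lag7 = lag7.daily, weekly.lag1 = lag1.weekly))
```

The first asks whether a given day resembles the same weekday one week later, while the second asks whether a whole week's total resembles the next week's total, so there is no reason for them to agree. For "does something repeat weekly at the daily level", the lag-7 daily version is probably closer.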
Upvotes: 2