andrewH

Reputation: 2321

R: How can I create a time series object with multiple time scales in an existing R time series class?

I wish to create a time series object in one of the existing R time series types in which observations are marked with two kinds of time: ordinal time, an index of consecutive integers that is less than, greater than, or equal to another observation's index according to whether the observation comes before, after, or at the same time as the other; and cardinal time, consisting of standard dates (days, at least for starters).

My actual data comes in three sub-day periods, each of which can have multiple (numerical, not count) observations or zero observations. My order index, which is my main measure of time, treats observations from the same period as at the same time, and removes periods where no events occur. I also want dates because I want to test for calendar effects.

Here is a toy data set:

library(tibble)  # provides tibble()

set.seed(1)
dates. <- seq(as.Date("2020-03-03"), by = "day", length.out = 8)[c(1, 1, 2, 3, 3, 3, 4, 4, 4, 8, 8, 8, 8)]
index. <- c(1, 2, 3, 4, 4, 5, 6, 7, 8, 9, 9, 9, 10)
dat. <- rnorm(13)
tib_ts <- tibble(dates., index., dat.)
tib_ts
# A tibble: 13 x 3
   dates.     index.   dat.
   <date>      <dbl>  <dbl>
 1 2020-03-03      1 -0.626
 2 2020-03-03      2  0.184
 3 2020-03-04      3 -0.836
 4 2020-03-05      4  1.60 
 5 2020-03-05      4  0.330
 6 2020-03-05      5 -0.820
 7 2020-03-06      6  0.487
 8 2020-03-06      7  0.738
 9 2020-03-06      8  0.576
10 2020-03-10      9 -0.305
11 2020-03-10      9  1.51 
12 2020-03-10      9  0.390
13 2020-03-10     10 -0.621

I’ve tried to figure out how to do this with zoo, xts, and tsibble, and have run into two problems. First, although each observation has its own date and index value, a single date or index value can be shared by several observations whose times are indistinguishable at that time scale. Second, I want to apply the usual time series tools sometimes to one time measure and sometimes to the other, and I have not found a way to switch back and forth between them.

However, I am convinced that an implementation must already be out there in an existing class or package, as there are common problems with the same or similar time structure. Take, for example, large casualty losses by dollar value, aggregated by hour or day or month. At any aggregation scale there will be periods with no losses and periods with multiple losses. The same holds for daily high and low values, whether of temperature or stock price: you know they come after yesterday's high and low and before tomorrow's, but you don't know which came first, or whether they are two minutes or 20 hours apart. Stock price data often treats Monday as if it were the day after Friday, because no transactions take place in the interim. And so forth.

Upvotes: 0

Views: 345

Answers (1)

G. Grothendieck

Reputation: 269371

Suppose the observations are in values, the ordinal numbers are in ix, and the dates are in d. Then assign the numbers as the names of d and create a zoo object indexed by d:

library(zoo)

values <- 1:4

ix <- c(0, 3, 4, 6)
d <- as.Date("2000-01-01") + ix
names(d) <- ix

z <- zoo(values, d)
time(z)
##            0            3            4            6 
## "2000-01-01" "2000-01-04" "2000-01-05" "2000-01-07" 

In this example the difference between any two numbers equals the difference (in days) between the corresponding dates, but that is not required. The numbers could be unrelated to the dates.
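
To switch from the date scale to the ordinal scale under this scheme, one option is to rebuild the series on the numbers stored in the names. The following is only a sketch: the values and dates are made up, z_ord is a hypothetical name, and it assumes the dates are unique (zoo warns when index values are duplicated, as they are in the question's toy data).

```r
library(zoo)

values <- c(10, 20, 30, 40)
d  <- as.Date(c("2020-03-03", "2020-03-04", "2020-03-06", "2020-03-10"))
ix <- 1:4                 # ordinal index; unrelated to the gaps between dates

names(d) <- ix            # carry the ordinal numbers along as names
z <- zoo(values, d)       # series indexed by calendar date

# Rebuild the same observations on the ordinal index recovered from the names
z_ord <- zoo(coredata(z), as.numeric(names(time(z))))

time(z_ord)               # ordinal time: 1 2 3 4
```

Both z and z_ord hold the same observations, so date-based tools can be applied to z and order-based tools to z_ord.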

If the numbers do have the above relationship to the dates then another possibility is to just use the dates and then derive the numbers using this when you need them:

zz <- zoo(values, unname(d))
as.numeric(time(zz) - time(zz)[1]) # derive numbers from dates
## [1] 0 3 4 6
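
If the ordinal index should instead skip periods with no observations (as when Monday is treated as the day after Friday), the numbers can still be derived from the dates as a dense rank. This is a base R sketch using the dates from the question; ord is a hypothetical name, and the result is at day resolution only, so it does not reproduce the question's sub-day index.

```r
# Dates from the question's toy data (with repeats and a gap)
dates <- seq(as.Date("2020-03-03"), by = "day", length.out = 8)[
  c(1, 1, 2, 3, 3, 3, 4, 4, 4, 8, 8, 8, 8)]

# Dense rank: position of each date among the sorted distinct dates,
# so equal dates share a number and empty days are skipped
ord <- match(dates, sort(unique(dates)))
ord
## [1] 1 1 2 3 3 3 4 4 4 5 5 5 5
```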

Upvotes: 1
