Aravind_005

Reputation: 25

XTS:: Help me on the usage & differences between period.apply() & to.period()

I am learning time series analysis with R and came across these two functions. I understand that both return periodic data at the frequency set by the period, and the only difference I can see is the OHLC output option in to.period().

Other than the OHLC output, when should each of these functions be used?

Upvotes: 1

Views: 197

Answers (1)

phiver

Reputation: 23598

to.period and its wrappers (to.minutes, to.weekly, to.quarterly, and so on) are indeed meant for OHLC data.

to.period takes the open from the first observation of the period, the close from the last observation of the period, and the highest high / lowest low within the period. These functions work very well together with the quantmod / tidyquant / quantstrat packages. See code example 1.
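To make the aggregation rule above concrete, here is a minimal sketch that rebuilds the quarterly bars by hand and compares them with to.period. The helper name ohlc_by_hand is hypothetical, and the comparison looks only at the values, not at the index or column names.

```r
library(xts)

data(sample_matrix)
x <- as.xts(sample_matrix)            # columns: Open, High, Low, Close
ep <- endpoints(x, on = "quarters")   # row positions where each quarter ends

# Hypothetical helper: first Open, max High, min Low, last Close of one period
ohlc_by_hand <- function(p) {
  m <- coredata(p)
  c(Open  = m[1, "Open"],
    High  = max(m[, "High"]),
    Low   = min(m[, "Low"]),
    Close = m[nrow(m), "Close"])
}

manual <- period.apply(x, ep, ohlc_by_hand)
auto   <- to.period(x, period = "quarters")

# Compare the numbers only; index class and column names may differ
same_values <- isTRUE(all.equal(coredata(manual), coredata(auto),
                                check.attributes = FALSE))
same_values
```

If same_values comes back TRUE, to.period is doing nothing more than this first / max / min / last summary per period.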

If you give to.period non-OHLC data, such as a time series with a single data column, you still get an OHLC-style result back: the first, highest, lowest, and last value of each period. See code example 2.

Now period.apply is more interesting. Here you can supply your own function to be applied to the data. Especially in combination with endpoints, this is a powerful tool for aggregating time series data to different time periods. The INDEX argument is usually built with endpoints, since endpoints gives you the row positions you need to move to a coarser time level (from days to weeks, and so on). See code examples 3 and 4.
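As a small illustration of how endpoints and period.apply fit together, here is a sketch on a made-up five-day series that straddles a month boundary. The series and its values are assumptions chosen only to keep the arithmetic easy to follow.

```r
library(xts)

# Made-up daily series: 2007-01-30, 2007-01-31, 2007-02-01, 2007-02-02, 2007-02-03
idx <- as.Date("2007-01-30") + 0:4
x <- xts(1:5, order.by = idx)

# endpoints() returns the row number of the last observation in each period,
# with a leading 0, so here: 0 2 5
ep <- endpoints(x, on = "months")
ep

# period.apply() runs FUN on rows (ep[i] + 1):ep[i + 1] for each i
monthly_sum <- period.apply(x, INDEX = ep, FUN = sum)
monthly_sum   # January: 1 + 2 = 3, February: 3 + 4 + 5 = 12
```

Because INDEX is just a vector of row positions, any vector of that shape works, not only the output of endpoints.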

Remember to use column-wise matrix functions (such as colMeans) with period.apply if you have more than one column of data, since an xts object is basically a matrix plus an index. See code example 5.

More info in this DataCamp course.
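The difference is easy to see by running both kinds of function on the same multi-column data. The sketch below is illustrative only; the variable names pooled and per_col are my own.

```r
library(xts)

data(sample_matrix)
x <- as.xts(sample_matrix)            # 4 columns: Open, High, Low, Close
ep <- endpoints(x, on = "quarters")

# mean() pools every cell of the period's block into one number
pooled <- period.apply(x, ep, mean)

# colMeans() keeps the columns apart: one mean per column per period
per_col <- period.apply(x, ep, colMeans)

dim(pooled)    # one column
dim(per_col)   # four columns
```

With mean you silently lose the column structure, which is rarely what you want for OHLC-like data.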

library(xts)

data(sample_matrix)
zoo.data <- zoo(rnorm(31)+10,as.Date(13514:13744,origin="1970-01-01"))


# code example 1
to.quarterly(sample_matrix)
        sample_matrix.Open sample_matrix.High sample_matrix.Low sample_matrix.Close
2007 Q1           50.03978           51.32342          48.23648            48.97490
2007 Q2           48.94407           50.33781          47.09144            47.76719

# same as to.quarterly
to.period(sample_matrix, period = "quarters")
        sample_matrix.Open sample_matrix.High sample_matrix.Low sample_matrix.Close
2007 Q1           50.03978           51.32342          48.23648            48.97490
2007 Q2           48.94407           50.33781          47.09144            47.76719


# code example 2
to.period(zoo.data, period = "quarters")
           zoo.data.Open zoo.data.High zoo.data.Low zoo.data.Close
2007-03-31      9.039875      11.31391     7.451139       10.35057
2007-06-30     10.834614      11.31391     7.451139       11.28427
2007-08-19     11.004465      11.31391     7.451139       11.30360

# code example 3 using base standard deviation in the chosen period
period.apply(zoo.data, endpoints(zoo.data, on = "quarters"), sd)
2007-03-31 2007-06-30 2007-08-19 
  1.026825   1.052786   1.071758 

# code example 4: self-defined function summing x + x for the period
period.apply(zoo.data, endpoints(zoo.data, on = "quarters"), function(x) sum(x + x) )
2007-03-31 2007-06-30 2007-08-19 
 1798.7240  1812.4736   993.5729 

# code example 5
period.apply(sample_matrix, endpoints(sample_matrix, on = "quarters"), colMeans)
               Open     High      Low    Close
2007-03-31 50.15493 50.24838 50.05231 50.14677
2007-06-30 48.47278 48.56691 48.36606 48.45318

Upvotes: 1
