greedIsGood
greedIsGood

Reputation: 19

How to group/sum xts time series by whole seconds

Any package can do the following example? I have big xts dataset in millisecond level.
Can I sum the coredata up to second level?
The time is index.

For example:

enter image description here

The ideal result is:

enter image description here

Upvotes: 0

Views: 425

Answers (2)

FXQuantTrader
FXQuantTrader

Reputation: 6891

library(xts)

# To see subsecond timestamp values:
options(digits.secs = 3)

set.seed(1)
x <- xts(rnorm(10), as.POSIXct("2018-11-01 00:00:00.3") + seq(0.05, 3, length.out = 10))
# [,1]
# 2018-11-01 00:00:00.349 -0.6264538
# 2018-11-01 00:00:00.677  0.1836433
# 2018-11-01 00:00:01.005 -0.8356286
# 2018-11-01 00:00:01.333  1.5952808
# 2018-11-01 00:00:01.661  0.3295078
# 2018-11-01 00:00:01.988 -0.8204684
# 2018-11-01 00:00:02.316  0.4874291
# 2018-11-01 00:00:02.644  0.7383247
# 2018-11-01 00:00:02.972  0.5757814
# 2018-11-01 00:00:03.299 -0.3053884

To sum values within second points, you can use xts's period.apply, combined with finding the index of the last observation within each second using xts's endpoints:

ep <- endpoints(x, on = "seconds", k = 1)

xs <- period.apply(x, ep, sum)

"whole second" timestamps may not be desirable, which you can then fix using xts's align.time, which rounds to the end of the second (important to round to the end of the second to avoid generating look forward bias in your data)

xs
#                               [,1]
# 2018-11-01 00:00:00.677 -0.4428105
# 2018-11-01 00:00:01.988  0.2686916
# 2018-11-01 00:00:02.972  1.8015351
# 2018-11-01 00:00:03.299 -0.3053884

xs <- align.time(xs, n = 1)
xs
#                           [,1]
# 2018-11-01 00:00:01 -0.4428105
# 2018-11-01 00:00:02  0.2686916
# 2018-11-01 00:00:03  1.8015351
# 2018-11-01 00:00:04 -0.3053884

Upvotes: 0

AkselA
AkselA

Reputation: 8846

You can aggregate by using aggregate.zoo() from the zoo package.

library(xts)
library(zoo)

# A simple example time series
xx <- xts(1:20, as.POSIXct((1:20)/5, origin="2000-01-01"))

aggregate(xx, as.POSIXct(trunc(time(xx), "sec")), sum)

# 2000-01-01 01:00:00 10
# 2000-01-01 01:00:01 35
# 2000-01-01 01:00:02 60
# 2000-01-01 01:00:03 85
# 2000-01-01 01:00:04 20

Upvotes: 1

Related Questions