P. Garnry

Reputation: 364

How to handle 'Series contains non-leading NAs' in TTR library with xts objects?

I often run into the same issue when building quantitative trading models: how to handle NA values. The example below is a stock with EOD data since 1997-01-01, stored in an xts object with four columns named "High", "Low", "Close", "Volume". The data is from Bloomberg. When I try to calculate the rolling 20-day average volume, I get this error:

SMA(stock$Volume, 20)
Error in runSum(x, n) : Series contains non-leading NAs  

I quickly located the problem (I knew it was NA values, since I have hit this a thousand times) and found the two days where volume data is missing. I have reproduced those days' data below. As a quick observation, the SMA, EMA, etc. functions in TTR cannot handle NAs that are both preceded and followed by numbers.

stock <- as.xts(matrix(c(94.46,92.377,94.204,NA,71.501,70.457,70.979,NA), 2, 4,
  byrow = TRUE, dimnames = list(NULL, c("High","Low","Close","Volume"))),
  as.Date(c("1998-07-07", "1999-02-22")))

What is the best way to handle this issue? Should I store stock$Volume as a temporary object with the NA values removed, calculate the rolling volume on that, and then merge it back with merge.xts using fill = NA so the NA values are inserted again? But is that correct, given that it uses the last 20 trading days with data rather than just the 19 values available in the 20-day window?
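To make the two interpretations in that last question concrete, here is a minimal sketch on synthetic data (the dates, values, and column names are made up for illustration and are not the Bloomberg series). It compares dropping the NA rows before calling SMA, which averages the last 20 available observations, with a rolling mean over a fixed 20-row window that ignores NAs, which averages only the 19 values present. The second variant uses rollapplyr from zoo, which is a swapped-in technique and not part of TTR.

```r
library(xts)   # also attaches zoo (rollapplyr, coredata)
library(TTR)

# Synthetic volume series with two non-leading NAs (illustrative data only)
set.seed(1)
dates <- as.Date("2020-01-01") + 0:59
v <- round(runif(60, 1e5, 2e5))
v[c(25, 40)] <- NA
vol <- xts(v, order.by = dates)
colnames(vol) <- "Volume"

# SMA(vol, 20) would stop with "Series contains non-leading NAs"

# Variant 1: drop the NA rows first, so each average uses the
# last 20 *available* observations (the window stretches over the gap)
sma_obs <- SMA(na.omit(vol), n = 20)
colnames(sma_obs) <- "SMA_last20obs"

# Variant 2: keep a fixed 20-row window and average whatever is present
# (19 values when one NA falls inside the window)
sma_win <- rollapplyr(vol, width = 20, FUN = mean, na.rm = TRUE, fill = NA)
colnames(sma_win) <- "SMA_window20"

# Outer join on the original index; dates dropped by na.omit come back as NA
result <- merge(vol, sma_obs, sma_win)
result[24:27, ]
```

Which variant is appropriate depends on what the indicator is meant to represent; the sketch only shows that the two give different numbers around the missing days.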

It is my hope that some sort of "best practice" can be the outcome of this post as I assume this issue also happens for other R-users in finance whether they get their data from Bloomberg, Yahoo Finance or another source.

Upvotes: 10

Views: 14337

Answers (3)

Boon Hong

Reputation: 27

Try na.omit(), but keep in mind that removing missing values introduces a systematic bias.

Upvotes: 0

Nord Farsi

Reputation: 41

Take your initial time series containing NAs, say a.ts, and approximate the NAs with na.approx, a generic function that replaces each NA with an interpolated value (see the zoo package documentation for details):

b.ts=na.approx(a.ts)

b.ts is the time
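As a small worked sketch of the interpolation approach, assuming a made-up daily series (a.ts here is synthetic, and the window length of 3 is chosen only to keep the example short):

```r
library(xts)   # also attaches zoo, which provides na.approx
library(TTR)

# Synthetic series with two interior NAs (illustrative data only)
dates <- as.Date("2020-01-01") + 0:9
a.ts <- xts(c(100, 110, 120, NA, 140, 150, NA, 170, 180, 190),
            order.by = dates)

# Linear interpolation between the neighbouring observations,
# spaced according to the time index
b.ts <- na.approx(a.ts)

# With the interior NAs filled, the TTR functions run without error
sma <- SMA(b.ts, n = 3)
```

Note that each interpolated value depends on the next observed value, so in a trading model it uses information that was not available on the missing day.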

Upvotes: 4

Joshua Ulrich

Reputation: 176668

I don't know about "best practice" but one alternative might be what are called "inhomogeneous time series operators", as presented in Operators on Inhomogeneous Time Series.

This type of question is a good fit for the Quantitative Finance Stack Exchange site (e.g. see "How to update an exponential moving average with missing values?").
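As a rough illustration of the general idea, here is a sketch of an exponential moving average whose decay depends on the time elapsed between observations, so a missing day simply becomes a longer gap. It is an assumption-laden sketch, not a reproduction of the operators in the paper: ema_irregular and tau are hypothetical names, the function is not part of TTR or xts, and seeding with the first value and weighting the newest observation are choices made here for simplicity.

```r
library(xts)   # also attaches zoo (index, coredata)

# Hypothetical helper (not from TTR, xts, or the paper): EMA over
# irregularly spaced observations with a time constant tau in days
ema_irregular <- function(x, tau) {
  x <- na.omit(x)                  # missing days become longer time gaps
  t <- as.numeric(index(x))        # days since epoch for a Date index
  z <- as.numeric(coredata(x))
  out <- numeric(length(z))
  out[1] <- z[1]                   # assumption: seed with the first value
  for (i in seq_along(z)[-1]) {
    mu <- exp(-(t[i] - t[i - 1]) / tau)   # weight kept by the previous EMA
    out[i] <- mu * out[i - 1] + (1 - mu) * z[i]
  }
  xts(out, order.by = index(x))
}

# Usage on a made-up series with one missing day
vol <- xts(c(100, 120, NA, 90, 110),
           order.by = as.Date("2020-01-01") + 0:4)
ema_irregular(vol, tau = 3)
```

With a calendar-date index, weekends and holidays also count as elapsed time, so the result is not directly comparable to an EMA computed per trading day.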

Upvotes: 3
