number8

Reputation: 161

Removing days with fewer than a full set of observations

I have an xts object that covers 169 days of regular, high-frequency 5-minute observations, but some of the days have missing observations, i.e. fewer than 288 data points. How do I remove these so that I am left with only the days that have a full set of data points?

# find days in data
ddx = endpoints(dxts, on="days")
days = format(index(dxts)[ddx], "%Y-%m-%d")

# report the number of rows recorded on each day
for (day in days) {
  x = dxts[day]
  cat('', day, "has", NROW(x), "records...\n")
}

I tried

RTAQ::exchangeHoursOnly(dxts, daybegin = "00:00:00", dayend = "23:55:00") 

but this still returned the full data set.

Thanks

Upvotes: 1

Views: 256

Answers (1)

GSee

Reputation: 49820

Split by days. Count the number of rows in each day, and keep only the days that have at least 288 rows.

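library(xts)
# Sample data: 1000 observations at 5-minute intervals
dxts <- .xts(rnorm(1000), 1:1000*5*60)
# Keep a day's data only if it has at least 288 rows; otherwise return NULL
daylist <- lapply(split(dxts, "days"), function(x) {
    if(NROW(x) >= 288) x
})
# Bind the remaining days back into a single xts object
do.call(rbind, daylist)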

The above splits dxts by "days". Then, if a day has at least 288 rows, the function returns all the data for that day; otherwise, it returns NULL. So daylist will be a list whose elements are each either an xts object or NULL. The do.call part calls rbind on the list, which is like calling rbind(daylist[[1]], daylist[[2]], ..., daylist[[n]]). The NULLs are dropped when binding, so you'll be left with a single xts object that omits days with fewer than 288 rows.
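As a cross-check on the same idea, here is a minimal sketch of a different technique that avoids the split/rbind round trip: label every row with its day, count the rows per day with ave, and subset directly. This is an alternative, not the approach given above. The names day, n_per_day and full are illustrative, and the sample data is pinned to UTC (an assumption) so that the day boundaries are reproducible.

```r
library(xts)

# Sample data (assumption): 1000 five-minute observations, index in UTC
dxts <- .xts(rnorm(1000), 1:1000 * 5 * 60, tzone = "UTC")

# Calendar-day label for every observation
day <- format(index(dxts), "%Y-%m-%d")

# Number of rows on each observation's day, repeated for every row of that day
n_per_day <- ave(seq_along(day), day, FUN = length)

# Keep only the rows that belong to days with at least 288 observations
full <- dxts[n_per_day >= 288]

# Rows per remaining day, for inspection
table(format(index(full), "%Y-%m-%d"))
```

With real data, the day labels follow the time zone of the object's index, so a day that looks complete in one time zone may be split across two days in another.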

Upvotes: 2
