user3584057

Reputation: 15

How do I subset every day except the last five days of zoo data?

I am trying to extract every date except the last five days of each month from a zoo dataset into a single object.

This question is related to "How do I subset the last week for every month of a zoo object in R?"

You can reproduce the dataset with this code:

library(zoo)

set.seed(123)
price <- rnorm(365)
dates <- seq(as.Date("2013-01-01"), by = "day", length.out = 365)
# Build the series directly; cbind() would coerce the dates to plain numbers
zoodata <- zoo(price, dates)

For my output, I'm hoping to get a combined dataset of everything except the last five days of each month. For example, if there are 20 days in the first month's data and 19 days in the second month's, I only want to subset the first 15 and 14 days of data respectively.

I tried using head() and first() to extract the first three weeks, but since the number of days varies from month to month (and February differs in leap years), that is not ideal.

Thank you.

Upvotes: 0

Views: 688

Answers (2)

G. Grothendieck

Reputation: 269461

Here are a few approaches:

1) as.Date. Let tt be the dates. We compute a Date vector of the same length as tt that holds, for each date, the last date of its month. We then keep the dates that are at least 5 days before that last date:

tt <- time(zoodata)
last.date.of.month <- as.Date(as.yearmon(tt), frac = 1)
zoodata[ last.date.of.month - tt >= 5 ]
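As a quick sanity check of approach (1), the sketch below rebuilds the question's series and counts how many dates are removed from each month. It assumes the zoo package is installed; the names kept and dropped are introduced only for this illustration.

```r
library(zoo)

# Rebuild the daily series from the question (all 365 days of 2013).
set.seed(123)
dates <- seq(as.Date("2013-01-01"), by = "day", length.out = 365)
zoodata <- zoo(rnorm(365), dates)

tt <- time(zoodata)
# frac = 1 maps each year-month to its last calendar day.
last.date.of.month <- as.Date(as.yearmon(tt), frac = 1)
kept <- zoodata[last.date.of.month - tt >= 5]

# Dates removed per month: original count minus remaining count.
dropped <- table(format(tt, "%Y-%m")) - table(format(time(kept), "%Y-%m"))
print(dropped)
```

Note that this approach works on calendar days, so for a series with gaps (for example, trading days only) it can remove fewer than five observations per month.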

2) tapply/head. For each month, use tapply to apply head(x, -5) to the data, then concatenate the trimmed months back together:

do.call("c", tapply(zoodata, as.yearmon(time(zoodata)), head, -5))
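To see what head() does with a negative count, here is a minimal sketch on plain vectors rather than on the zoo series. The names vals, month, and trimmed are made up for the example.

```r
# head() with a negative n drops that many elements from the end.
head(1:8, -5)   # 1 2 3
head(1:4, -5)   # integer(0): a group of five or fewer elements disappears

# The same idea per group: two hypothetical "months" of 8 and 7 values.
vals  <- 1:15
month <- rep(c("m1", "m2"), times = c(8, 7))
trimmed <- tapply(vals, month, head, -5)

# Flatten the per-group results back into one vector.
unlist(trimmed, use.names = FALSE)   # 1 2 3 9 10
```

Unlike approach (1), this counts observations rather than calendar days, so it always removes the last five elements of each group regardless of gaps in the dates.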

3) ave. Define revseq which, given a vector or zoo object, returns sequence numbers in reverse order so that the last element corresponds to 1. Then use ave to create a vector ix, the same length as zoodata, which assigns these reverse sequence numbers to the days of each month. Thus the ix value for the last day of the month is 1, for the second-last day 2, and so on. Finally, subset zoodata to the elements whose sequence numbers are greater than 5:

revseq <- function(x) rev(seq_along(x))
ix <- ave(seq_along(zoodata), as.yearmon(time(zoodata)), FUN = revseq)
z <- zoodata[ ix > 5 ]
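The reverse numbering is easiest to see on a toy example. The sketch below uses a hypothetical grouping vector grp in place of the year-month labels, and drops the last element of each group instead of the last five.

```r
revseq <- function(x) rev(seq_along(x))

# Two hypothetical groups of sizes 3 and 2.
grp <- c("a", "a", "a", "b", "b")

# Within each group, number the elements counting back from the end.
ix <- ave(seq_along(grp), grp, FUN = revseq)
ix                         # 3 2 1 2 1

# Keep everything except the last element of each group.
seq_along(grp)[ix > 1]     # 1 2 4
```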

ADDED Solutions (1) and (2).

Upvotes: 1

luis_js

Reputation: 611

This works exactly the same way as in the answer to your other question.

Split the dataset by month and remove the last 5 days of each piece; compared with that answer, just add a "-" in front of the period:

library(xts)
xts.data <- as.xts(zoodata)
lapply(split(xts.data, "months"), last, "-5 days")

Likewise, if you want the result in a single object:

do.call(rbind, lapply(split(xts.data, "months"), last, "-5 days")) 

Upvotes: 1
