Reputation: 2375
I have two data frames. The one consists of three variables, namely "date", "strike" and "vol" with 20 observations a day, 100 a month and 1200 a year (in trading days), which looks like this
Date Price Vol
2008-09-01 20 0.2
2008-09-01 30 0.5
...
So for each month I have certain values for price and vol, ranging from 10 to 40, 0.1 to 0.7, respectively.
The second one includes interpolated values from the first one. So I do not have the date anymore, however small steps for the other variables:
Price Vol
20 0.2
21 0.21
22 0.24
30 0.5
So, while one frame shows values in a discrete time, the other one is more or less of continuous nature.
Now my question: how is it possible to tell R to merge the second data frame into the first one, taking over the dates for the continuous prices/vols between the two discrete ones, to get to something like this:
Date Price Vol
2008-09-01 20 0.2
2008-09-01 21 0.21
2008-09-01 22 0.24
...
2008-09-01 30 0.5
I just cannot figure out how to do it. I always ended up with NA values for the dates which are no longer in ascending order.
Thank you very much for your support
Dani
Upvotes: 4
Views: 2534
Reputation: 10092
I totally missed the point with the first post. This one does the date. But I agree with Shane that unless some downstream function requires data frames, then a time series is a good idea.
A <- data.frame(date=rep("2001-05-25", 2), price=c(20, 30), vol=c(0.2, 0.5))
B <- data.frame(price=seq(min(A$price), max(A$price), by=1))
C <- merge(A, B, all=TRUE)
index <- which(!is.na(C$vol))
for (i in seq(nrow(A))[-1]) {
C$date[index[i-1]:index[i]] <- rep(A$date[i-1], A$price[i] - A$price[i-1] + 1)
C$vol[index[i-1]:index[i]] <- seq(A$vol[i-1], A$vol[i], length=(A$price[i] - A$price[i-1] + 1))
}
ans <- C[, c(2, 1, 3)]
ans
date price vol
1 2001-05-25 20 0.20
2 2001-05-25 21 0.23
3 2001-05-25 22 0.26
4 2001-05-25 23 0.29
5 2001-05-25 24 0.32
6 2001-05-25 25 0.35
7 2001-05-25 26 0.38
8 2001-05-25 27 0.41
9 2001-05-25 28 0.44
10 2001-05-25 29 0.47
11 2001-05-25 30 0.50
Upvotes: 2
Reputation: 100164
First, use a time series class (e.g. zoo
or xts
).
Your second interpolated time series should still have a timestamp, even if it is hourly or every minute, etc. Use merge
to bring them together, then use na.locf
to carry the values forward from the lower frequency time series.
Here's an example:
ts1 <- zoo(1:5, as.POSIXct(as.Date("2010-10-01") + 1:5))
ts2 <- zoo(1:(5 * 24), as.POSIXct("2010-10-01 00:00:00") + (1:(5 * 24) * 3600))
na.locf(merge(ts1, ts2))
Upvotes: 4