Reputation: 31
Date Price
2006-01-03 12.02
2006-01-04 11.84
2006-01-05 11.83
...
EXPIRATION DATES
2006-01-18
2006-02-15
2006-03-22
...
Hello, I have a data frame of daily futures prices with corresponding dates. I also have a vector of all the relevant contract expiration dates for the futures prices.
The price column is the price for the contract expiring in the nearest month (12 month expiration cycle). For example, the 12.02 contract price on 2006-01-03 expires on 2006-01-18. I want to create a column that lists the relevant expiration date for each futures price so I can calculate days until expiration for each daily price. The logic would be:
all dates between 2006-01-03 and 2006-01-18 would have 2006-01-18 in the new expiration date column and so on for all the 127 expiration dates I have.
I tried playing around with mutate() and subset(), but I've had no luck. I assume this will be tedious, but just need someone to help me get started
Thanks
Upvotes: 0
Views: 92
Reputation: 43334
Assuming the two data.frames are called df
and df2
and dates are already formatted as such, with dplyr,
# add a row with a different expiration date to make sure it's working
df[4,] <- list(as.Date('2006-02-04'), 12)
library(dplyr)
df %>% rowwise() %>%
mutate(days_left = min(df2$EXPIRATION.DATES[df2$EXPIRATION.DATES > Date] - Date))
## Source: local data frame [4 x 3]
## Groups: <by row>
##
## # A tibble: 4 x 3
## Date Price days_left
## <date> <dbl> <S3: difftime>
## 1 2006-01-03 12.02 15 days
## 2 2006-01-04 11.84 14 days
## 3 2006-01-05 11.83 13 days
## 4 2006-02-04 12.00 11 days
or in base,
df$days_left <- lapply(df$Date, function(x){
min(df2$EXPIRATION.DATES[df2$EXPIRATION.DATES > x] - x)
})
df
## Date Price days_left
## 1 2006-01-03 12.02 15
## 2 2006-01-04 11.84 14
## 3 2006-01-05 11.83 13
## 4 2006-02-04 12.00 11
Subtracting dates calls difftime
, which it may be worth calling explicitly so you can specify units:
# dplyr
df %>% rowwise() %>%
mutate(days_left = df2$EXPIRATION.DATES[df2$EXPIRATION.DATES > Date] %>%
difftime(Date, units = 'days') %>%
min())
# base
df$days_left <- lapply(df$Date, function(x){
min(difftime(df2$EXPIRATION.DATES[df2$EXPIRATION.DATES > x], x, units = 'days'))
})
Depending on your data it may not make a difference, but it is a more robust approach than simple subtraction.
Upvotes: 1
Reputation: 10401
Disclaimer: I dislike pipes (I have my reasons) and when I can find a good "Base R" solution, I go for that one first. So here's my old fart solution.
I added more data to make sure it really works as expected.
# Create main dataframe
df1 <- read.table(text=
"Date Price
2006-01-03 12.02
2006-01-18 12.04
2006-01-22 12.05
2006-02-01 11.99
2006-02-16 11.84
2006-03-21 11.83
2006-03-22 11.90
2006-03-29 12.00
", head=T, stringsAsFactors=FALSE)
# Convert Date column to a proper Date-classed column
df1$Date <- as.Date(df1$Date)
# Generate an expiration dates vector
exp_dates <- as.Date(c("2006-01-18", "2006-02-15", "2006-03-22", "2006-04-18"))
# initialize df1$exp_dates
df1$exp_date <- NA
class(df1$exp_date) <- "Date"
# Loop over rows and find closest expir. date which is not past the date
for(i in 1:nrow(df1))
df1$exp_date[i] <- exp_dates[which.max((df1$Date[i]-exp_dates) <= 0)]
(Yeah, I also loops, and I even like it! :^p)
df1
Date Price exp_date
1 2006-01-03 12.02 2006-01-18
2 2006-01-18 12.04 2006-01-18
3 2006-01-22 12.05 2006-02-15
4 2006-02-01 11.99 2006-02-15
5 2006-02-16 11.84 2006-03-22
6 2006-03-21 11.83 2006-03-22
7 2006-03-22 11.90 2006-03-22
8 2006-03-29 12.00 2006-04-18
Upvotes: 0