R Curti
R Curti

Reputation: 31

creating date vector from existing date vector

          Date    Price      
       2006-01-03 12.02  
       2006-01-04 11.84  
       2006-01-05 11.83 
       ...  

      EXPIRATION DATES
       2006-01-18
       2006-02-15
       2006-03-22
       ...

Hello, I have a data frame of daily futures prices with corresponding dates. I also have a vector of all the relevant contract expiration dates for the futures prices.

The price column is the price for the contract expiring in the nearest month (12 month expiration cycle). For example, the 12.02 contract price on 2006-01-03 expires on 2006-01-18. I want to create a column that lists the relevant expiration date for each futures price so I can calculate days until expiration for each daily price. The logic would be:

all dates between 2006-01-03 and 2006-01-18 would have 2006-01-18 in the new expiration date column and so on for all the 127 expiration dates I have.

I tried playing around with mutate() and subset(), but I've had no luck. I assume this will be tedious, but just need someone to help me get started

Thanks

Upvotes: 0

Views: 92

Answers (2)

alistaire
alistaire

Reputation: 43334

Assuming the two data.frames are called df and df2 and dates are already formatted as such, with dplyr,

# add a row with a different expiration date to make sure it's working
df[4,] <- list(as.Date('2006-02-04'), 12)

library(dplyr)

df %>% rowwise() %>% 
    mutate(days_left = min(df2$EXPIRATION.DATES[df2$EXPIRATION.DATES > Date] - Date))
## Source: local data frame [4 x 3]
## Groups: <by row>
##   
## # A tibble: 4 x 3
##         Date Price      days_left
##       <date> <dbl> <S3: difftime>
## 1 2006-01-03 12.02        15 days
## 2 2006-01-04 11.84        14 days
## 3 2006-01-05 11.83        13 days
## 4 2006-02-04 12.00        11 days

or in base,

df$days_left <- lapply(df$Date, function(x){
    min(df2$EXPIRATION.DATES[df2$EXPIRATION.DATES > x] - x)
})

df
##         Date Price days_left
## 1 2006-01-03 12.02        15
## 2 2006-01-04 11.84        14
## 3 2006-01-05 11.83        13
## 4 2006-02-04 12.00        11

Subtracting dates calls difftime, which it may be worth calling explicitly so you can specify units:

# dplyr
df %>% rowwise() %>% 
    mutate(days_left = df2$EXPIRATION.DATES[df2$EXPIRATION.DATES > Date] %>% 
               difftime(Date, units = 'days') %>% 
               min())
# base
df$days_left <- lapply(df$Date, function(x){
    min(difftime(df2$EXPIRATION.DATES[df2$EXPIRATION.DATES > x], x, units = 'days'))
})

Depending on your data it may not make a difference, but it is a more robust approach than simple subtraction.

Upvotes: 1

Dominic Comtois
Dominic Comtois

Reputation: 10401

Disclaimer: I dislike pipes (I have my reasons) and when I can find a good "Base R" solution, I go for that one first. So here's my old fart solution.

I added more data to make sure it really works as expected.

# Create main dataframe
df1 <- read.table(text=
"Date Price
2006-01-03 12.02
2006-01-18 12.04
2006-01-22 12.05
2006-02-01 11.99
2006-02-16 11.84
2006-03-21 11.83
2006-03-22 11.90
2006-03-29 12.00
", head=T, stringsAsFactors=FALSE)

# Convert Date column to a proper Date-classed column
df1$Date <- as.Date(df1$Date)

# Generate an expiration dates vector
exp_dates <- as.Date(c("2006-01-18", "2006-02-15", "2006-03-22", "2006-04-18"))

# initialize df1$exp_dates
df1$exp_date <- NA
class(df1$exp_date) <- "Date"

# Loop over rows and find closest expir. date which is not past the date
for(i in 1:nrow(df1))
  df1$exp_date[i] <- exp_dates[which.max((df1$Date[i]-exp_dates) <= 0)]

(Yeah, I also loops, and I even like it! :^p)

df1

        Date Price   exp_date
1 2006-01-03 12.02 2006-01-18
2 2006-01-18 12.04 2006-01-18
3 2006-01-22 12.05 2006-02-15
4 2006-02-01 11.99 2006-02-15
5 2006-02-16 11.84 2006-03-22
6 2006-03-21 11.83 2006-03-22
7 2006-03-22 11.90 2006-03-22
8 2006-03-29 12.00 2006-04-18

Upvotes: 0

Related Questions