rrg

Reputation: 675

Mixed date datafile

I'm reading in the following data file: three pairs of date-price columns (plus a column index number). The problem is that each price series follows different national holidays, so the UK and US prices eventually misalign. Is there a nice way to convert the dates to an xts/zoo format and fill with NA where a price doesn't exist (the market is closed)?

ColNumb  Date1       UK2Y    Date2       US2Y    Date3       GBPUSD
1        09/07/2012  0.9330  09/07/2012  0.5210  09/07/2012  1.552554
2        10/07/2012  0.9401  10/07/2012  0.5235  10/07/2012  1.551831
3        11/07/2012  0.9122  11/07/2012  0.5003  11/07/2012  1.550388
4        12/07/2012  0.8732  12/07/2012  0.4805  12/07/2012  1.542972

etc

UK2y <- as.xts(data[1:1033,1:2])
US2y <- as.xts(data[,3:4])
GBPUSD <- data[,5:6]

I have tried using {A <- strptime(UK2y$Date1, format = "%d/%m/%Y")}, but this leads to an invalid zoo object. I end up with correctly formatted dates in 'A' as a POSIX class, which then fails to cbind with the zoo object ("error in structure"):

UK2y <- cbind(UK2y, A)

As you can see above, there is an extra issue: each date-price pair of columns has a different length. Some kind of "date match" function would help, or perhaps a solution already exists in zoo/xts?

Upvotes: 0

Views: 50

Answers (1)

bVa

Reputation: 3938

Here is a solution using merge:

# subset your data
UK2Y = data[,c("Date1", "UK2Y")]
US2Y = data[,c("Date2", "US2Y")]
GBPUSD = data[,c("Date3", "GBPUSD")]

# rename them to have the same Date column
names(UK2Y)[names(UK2Y) == "Date1"] <- "Date"
names(US2Y)[names(US2Y) == "Date2"] <- "Date"
names(GBPUSD)[names(GBPUSD) == "Date3"] <- "Date"

# Test: remove one row to simulate a missing price
US2Y = US2Y[-4,] # market closed in the US on this day

# Merge the data frames
group = merge(UK2Y, US2Y, by = "Date", all = TRUE) # "all = TRUE" keeps every date and shows missing data as NA
group = merge(group, GBPUSD, by = "Date", all = TRUE)

print(group)

        Date   UK2Y   US2Y   GBPUSD
1 2012-07-09 0.9330 0.5210 1.552554
2 2012-07-10 0.9401 0.5235 1.551831
3 2012-07-11 0.9122 0.5003 1.550388
4 2012-07-12 0.8732     NA 1.542972
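
For reference, the steps above can be sketched as one self-contained script. This is only an illustration: the toy data frame copies the sample rows from the question, get_series is a hypothetical helper (not part of any package), and it assumes the dates are stored as dd/mm/yyyy strings, so they are parsed with an explicit format before merging.

```r
# Toy data in the question's layout; values copied from the sample rows
data <- data.frame(
  Date1  = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
  UK2Y   = c(0.9330, 0.9401, 0.9122, 0.8732),
  Date2  = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
  US2Y   = c(0.5210, 0.5235, 0.5003, 0.4805),
  Date3  = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
  GBPUSD = c(1.552554, 1.551831, 1.550388, 1.542972),
  stringsAsFactors = FALSE
)

# Hypothetical helper: extract one date/price pair and parse the date column
get_series <- function(df, date_col, price_col) {
  out <- data.frame(Date = as.Date(df[[date_col]], format = "%d/%m/%Y"))
  out[[price_col]] <- df[[price_col]]
  # drop rows whose date is missing (assumed padding of a shorter series)
  out[!is.na(out$Date), , drop = FALSE]
}

series <- list(
  get_series(data, "Date1", "UK2Y"),
  get_series(data, "Date2", "US2Y")[-4, ],  # pretend the US market was closed on the last day
  get_series(data, "Date3", "GBPUSD")
)

# Full outer join on Date across all series; gaps become NA
group <- Reduce(function(x, y) merge(x, y, by = "Date", all = TRUE), series)
print(group)
```

Parsing to Date before merging matters because dd/mm/yyyy strings sort alphabetically rather than chronologically once the data spans more than one month.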

EDIT

You can create an empty data frame with the correct dates generated in the order you want, and then merge:

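# the sample dates are dd/mm/yyyy, so give as.Date() the format explicitly
UK2Y$Date = as.Date(UK2Y$Date, format = "%d/%m/%Y")
US2Y$Date = as.Date(US2Y$Date, format = "%d/%m/%Y")
GBPUSD$Date = as.Date(GBPUSD$Date, format = "%d/%m/%Y")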

# create empty dataframe with correct dates
dates = data.frame(Date = seq(as.Date("2012-07-01"), as.Date("2012-07-20"), by = '1 day'))

US2Y = US2Y[-4,] # Test: remove one row (US market closed that day)

group = merge(dates, UK2Y, by = "Date", all = TRUE)
group = merge(group, US2Y, by = "Date", all = TRUE)
group = merge(group, GBPUSD, by = "Date", all = TRUE)

print(group)
         Date   UK2Y   US2Y   GBPUSD
1  2012-07-01     NA     NA       NA
2  2012-07-02     NA     NA       NA
3  2012-07-03     NA     NA       NA
4  2012-07-04     NA     NA       NA
5  2012-07-05     NA     NA       NA
6  2012-07-06     NA     NA       NA
7  2012-07-07     NA     NA       NA
8  2012-07-08     NA     NA       NA
9  2012-07-09 0.9330 0.5210 1.552554
10 2012-07-10 0.9401 0.5235 1.551831
11 2012-07-11 0.9122 0.5003 1.550388
12 2012-07-12 0.8732     NA 1.542972
13 2012-07-13     NA     NA       NA
14 2012-07-14     NA     NA       NA
15 2012-07-15     NA     NA       NA
16 2012-07-16     NA     NA       NA
17 2012-07-17     NA     NA       NA
18 2012-07-18     NA     NA       NA
19 2012-07-19     NA     NA       NA
20 2012-07-20     NA     NA       NA
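
Since the question asks about zoo/xts specifically, here is a minimal sketch of the same idea with zoo objects. It assumes the zoo package is installed; the three small data frames and the to_zoo helper are made up for illustration, with values taken from the sample rows. merge() on zoo objects takes the union of the dates by default and fills the gaps with NA.

```r
library(zoo)  # assumption: the zoo package is installed

# Illustrative inputs; the US series is one row shorter (market closed)
uk <- data.frame(Date = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
                 UK2Y = c(0.9330, 0.9401, 0.9122, 0.8732))
us <- data.frame(Date = c("09/07/2012", "10/07/2012", "11/07/2012"),
                 US2Y = c(0.5210, 0.5235, 0.5003))
fx <- data.frame(Date = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
                 GBPUSD = c(1.552554, 1.551831, 1.550388, 1.542972))

# Hypothetical helper: build a zoo series indexed by the parsed dates
to_zoo <- function(df) {
  zoo(df[[2]], order.by = as.Date(df$Date, format = "%d/%m/%Y"))
}

# The argument names become the column names of the merged object
prices <- merge(UK2Y = to_zoo(uk), US2Y = to_zoo(us), GBPUSD = to_zoo(fx))
print(prices)
```

Because the series are aligned on their index, differing lengths are not a problem. If an xts object is needed, as.xts() from the xts package should accept the merged zoo object.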

Upvotes: 2
