Reputation: 675
I am reading in the following data file: three pairs of date-price columns (plus a column index number). The problem is that each market has different national holidays, so the UK and US prices eventually misalign. Is there a nice way to put the dates into an xts/zoo format and fill with NA where a price doesn't exist (the market is closed)?
ColNumb Date1 UK2Y Date2 US2Y Date3 GBPUSD
1 09/07/2012 0.9330 09/07/2012 0.5210 09/07/2012 1.552554
2 10/07/2012 0.9401 10/07/2012 0.5235 10/07/2012 1.551831
3 11/07/2012 0.9122 11/07/2012 0.5003 11/07/2012 1.550388
4 12/07/2012 0.8732 12/07/2012 0.4805 12/07/2012 1.542972
etc
UK2y <- as.xts(data[1:1033,1:2])
US2y <- as.xts(data[,3:4])
GBPUSD <- data[,5:6]
I have tried using A <- strptime(UK2y$Date1, format = "%d/%m/%Y"), but this leads to an invalid zoo object. I end up with correctly formatted dates in 'A' as a POSIX class, which fails to cbind with zoo ("error in structure"):
UK2y <- cbind(UK2y, A)
As you can see above, there is an extra issue: each date-price pair of columns has a different length. Some kind of "date match" function would help, or perhaps a solution already exists in zoo/xts?
Upvotes: 0
Views: 50
Reputation: 3938
Here is a solution using merge:
# subset your data
UK2Y = data[,c("Date1", "UK2Y")]
US2Y = data[,c("Date2", "US2Y")]
GBPUSD = data[,c("Date3", "GBPUSD")]
# rename them to have the same Date column
names(UK2Y)[names(UK2Y) == "Date1"] <- "Date"
names(US2Y)[names(US2Y) == "Date2"] <- "Date"
names(GBPUSD)[names(GBPUSD) == "Date3"] <- "Date"
# Test: remove one row to simulate a missing observation
US2Y = US2Y[-4,] # market closed in US this day
# Merge the data frames
group = merge(UK2Y, US2Y, by = "Date", all = TRUE) # all = TRUE keeps unmatched dates and fills the gaps with NA
group = merge(group, GBPUSD, by = "Date", all = TRUE)
print(group)
Date UK2Y US2Y GBPUSD
1 2012-07-09 0.9330 0.5210 1.552554
2 2012-07-10 0.9401 0.5235 1.551831
3 2012-07-11 0.9122 0.5003 1.550388
4 2012-07-12 0.8732 NA 1.542972
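To try the merge without the original file, here is a minimal self-contained sketch. The sample rows are copied from the question; the helper name pair is hypothetical, and the dates are parsed with format = "%d/%m/%Y" on the assumption that they are day-first strings as shown in the question.

```r
# Sample rows copied from the question; dates are day-first strings
data <- data.frame(
  Date1  = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
  UK2Y   = c(0.9330, 0.9401, 0.9122, 0.8732),
  Date2  = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
  US2Y   = c(0.5210, 0.5235, 0.5003, 0.4805),
  Date3  = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
  GBPUSD = c(1.552554, 1.551831, 1.550388, 1.542972),
  stringsAsFactors = FALSE
)

# Hypothetical helper: extract one date/price pair and parse the date column
pair <- function(df, date_col, price_col) {
  out <- data.frame(Date = as.Date(df[[date_col]], format = "%d/%m/%Y"))
  out[[price_col]] <- df[[price_col]]
  out[!is.na(out$Date), ]  # drop padding rows left by shorter series
}

UK2Y   <- pair(data, "Date1", "UK2Y")
US2Y   <- pair(data, "Date2", "US2Y")[-4, ]  # pretend the US market was closed
GBPUSD <- pair(data, "Date3", "GBPUSD")

# Reduce() chains the pairwise full outer joins on Date
group <- Reduce(function(a, b) merge(a, b, by = "Date", all = TRUE),
                list(UK2Y, US2Y, GBPUSD))
print(group)
```

Parsing to Date before merging matters because merge sorts on the key: day-first strings would sort alphabetically rather than chronologically once the data spans more than one month.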
EDIT
You can create an empty data frame with the correct dates generated in the order you want, and then merge:
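# convert the date strings (dd/mm/yyyy in the question's data) to Date class
UK2Y$Date = as.Date(UK2Y$Date, format = "%d/%m/%Y")
US2Y$Date = as.Date(US2Y$Date, format = "%d/%m/%Y")
GBPUSD$Date = as.Date(GBPUSD$Date, format = "%d/%m/%Y")
# create a data frame holding only the full sequence of calendar dates
dates = data.frame(Date = seq(as.Date("2012-07-01"), as.Date("2012-07-20"), by = "1 day"))
US2Y = US2Y[-4,] # test only: drop a row to mimic a US holiday (skip if already dropped above)
group = merge(dates, UK2Y, by = "Date", all = TRUE)
group = merge(group, US2Y, by = "Date", all = TRUE)
group = merge(group, GBPUSD, by = "Date", all = TRUE)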
print(group)
Date UK2Y US2Y GBPUSD
1 2012-07-01 NA NA NA
2 2012-07-02 NA NA NA
3 2012-07-03 NA NA NA
4 2012-07-04 NA NA NA
5 2012-07-05 NA NA NA
6 2012-07-06 NA NA NA
7 2012-07-07 NA NA NA
8 2012-07-08 NA NA NA
9 2012-07-09 0.9330 0.5210 1.552554
10 2012-07-10 0.9401 0.5235 1.551831
11 2012-07-11 0.9122 0.5003 1.550388
12 2012-07-12 0.8732 NA 1.542972
13 2012-07-13 NA NA NA
14 2012-07-14 NA NA NA
15 2012-07-15 NA NA NA
16 2012-07-16 NA NA NA
17 2012-07-17 NA NA NA
18 2012-07-18 NA NA NA
19 2012-07-19 NA NA NA
20 2012-07-20 NA NA NA
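Since the question asks about xts/zoo specifically, the same alignment can be sketched with the xts package, assuming it is installed. The helper name to_xts is hypothetical, the sample rows are from the question, and the NA entries in the US columns stand in for the padding a shorter series would have.

```r
library(xts)  # also loads zoo

# Sample rows from the question; the US pair is one row shorter (NA padding)
data <- data.frame(
  Date1  = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
  UK2Y   = c(0.9330, 0.9401, 0.9122, 0.8732),
  Date2  = c("09/07/2012", "10/07/2012", "11/07/2012", NA),
  US2Y   = c(0.5210, 0.5235, 0.5003, NA),
  Date3  = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
  GBPUSD = c(1.552554, 1.551831, 1.550388, 1.542972),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build one xts series from a date/price column pair
to_xts <- function(df, date_col, price_col) {
  d <- as.Date(df[[date_col]], format = "%d/%m/%Y")
  keep <- !is.na(d)                     # ignore padding rows
  x <- xts(df[[price_col]][keep], order.by = d[keep])
  colnames(x) <- price_col
  x
}

# merge() on xts objects does an outer join on the index by default,
# so dates missing from one series show up as NA
prices <- merge(to_xts(data, "Date1", "UK2Y"),
                to_xts(data, "Date2", "US2Y"),
                to_xts(data, "Date3", "GBPUSD"))
print(prices)
```

Using as.Date rather than strptime avoids the POSIXlt class, which is a list internally and does not work as a zoo/xts index or as a column to cbind.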
Upvotes: 2