Merging two dataframes with irregular timestamp columns

Question

I have a sample of equity-related data with prices that are date/time stamped at irregular, random intervals (on the order of seconds), in two data frames called ESH5 and ESM5.

I would like to generate another, complete data.frame with a date/time column that increments by one second, filled in with the values from ESH5 and ESM5. Each value becomes the 'latest price', which carries over to the next time interval unless ESH5 or ESM5 has a value for the matching time.

For example, ESH5

    Date                 Price    Type
    22/10/2015 9:00:00   50.10    Bid
    22/10/2015 9:00:02   50.12    Ask
    22/10/2015 9:00:06   50.10    Trade

ESM5

    Date                 Price    Type
    22/10/2015 9:00:01   50.09    Bid
    22/10/2015 9:00:02   50.11    Ask
    22/10/2015 9:00:04   50.09    Trade

I am looking to generate a full data.frame like

    Date                 ESH5.Bid   ESH5.Ask   ESH5.Trade   ESM5.Bid   ESM5.Ask   ESM5.Trade
    22/10/2015 9:00:00      50.10         NA           NA        NA          NA          NA
    22/10/2015 9:00:01      50.10         NA           NA     50.09          NA          NA
    22/10/2015 9:00:02      50.10       50.12          NA     50.09       50.11          NA
    22/10/2015 9:00:03      50.10       50.12          NA     50.09       50.11          NA
    22/10/2015 9:00:04
    22/10/2015 9:00:05
    22/10/2015 9:00:06

Currently, I generate the table with a for loop and if-else statements. I pre-generate an empty data frame of NAs with the regular timestamp increments, keep track of the latest bid, ask and trade, then run a conditional check on matching times to fill the table.

My current code works, but the loop takes quite a while to run (tens of minutes). Are there any built-in R functions I can use for this kind of search, replace and carry-forward?

Apologies if this is a bit hard to follow. Thank you.

RHA · Accepted Answer

I think what you want to do requires several steps:

First, create the date column with seq, as indicated by @akrun.
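As a minimal sketch of that first step, assuming the timestamps are in day/month/year format as in the question's sample and that a one-second grid from the first to the last observation is wanted (the names first, last and grid are only illustrative):

```r
# Parse the first and last timestamps from the sample data.
# The format string and the UTC time zone are assumptions.
fmt   <- "%d/%m/%Y %H:%M:%S"
first <- as.POSIXct("22/10/2015 9:00:00", format = fmt, tz = "UTC")
last  <- as.POSIXct("22/10/2015 9:00:06", format = fmt, tz = "UTC")

# One row per second between the first and last observation.
grid <- data.frame(Date = seq(from = first, to = last, by = "1 sec"))
```

For the sample data this gives one row for each second from 9:00:00 to 9:00:06.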

Second, reshape your data from long to wide. This can be done in several ways, but I think the dcast function from the reshape2 package is best:

    library(reshape2)

    # One row per timestamp, one column per Type (Ask, Bid, Trade)
    ESH5c <- dcast(ESH5, Date ~ Type, value.var='Price')
    ESM5c <- dcast(ESM5, Date ~ Type, value.var='Price')

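The last step is to merge these reshaped data frames onto your date vector (a left join, e.g. merge with all.x = TRUE), and then carry the last observed value forward into the NA gaps, for example with na.locf from the zoo package.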
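Putting the steps together, the whole pipeline might look roughly like the sketch below. It rebuilds the question's sample data, and it makes a few assumptions that are not in the original answer: the Date column starts out as text in day/month/year format, the carry-forward is done with na.locf from the zoo package, and the helper widen and the result name full are hypothetical names used only for illustration.

```r
library(reshape2)  # dcast
library(zoo)       # na.locf

# Sample data from the question; Date is kept as text, as it would be after import.
ESH5 <- data.frame(
  Date  = c("22/10/2015 9:00:00", "22/10/2015 9:00:02", "22/10/2015 9:00:06"),
  Price = c(50.10, 50.12, 50.10),
  Type  = c("Bid", "Ask", "Trade")
)
ESM5 <- data.frame(
  Date  = c("22/10/2015 9:00:01", "22/10/2015 9:00:02", "22/10/2015 9:00:04"),
  Price = c(50.09, 50.11, 50.09),
  Type  = c("Bid", "Ask", "Trade")
)

# Hypothetical helper: long -> wide, prefix the price columns with the
# contract name, and parse Date into POSIXct (format and UTC are assumptions).
widen <- function(df, prefix, fmt = "%d/%m/%Y %H:%M:%S") {
  wide <- dcast(df, Date ~ Type, value.var = "Price")
  names(wide)[-1] <- paste(prefix, names(wide)[-1], sep = ".")
  wide$Date <- as.POSIXct(as.character(wide$Date), format = fmt, tz = "UTC")
  wide
}
ESH5c <- widen(ESH5, "ESH5")
ESM5c <- widen(ESM5, "ESM5")

# Regular one-second grid spanning both data sets.
grid <- data.frame(
  Date = seq(min(ESH5c$Date, ESM5c$Date), max(ESH5c$Date, ESM5c$Date), by = "1 sec")
)

# Left-join both wide tables onto the grid; seconds with no quote become NA.
full <- merge(grid, ESH5c, by = "Date", all.x = TRUE)
full <- merge(full, ESM5c, by = "Date", all.x = TRUE)

# Carry the last observed price forward; na.rm = FALSE keeps the leading NAs.
full[-1] <- lapply(full[-1], na.locf, na.rm = FALSE)
```

Because everything here is vectorised, it should avoid the row-by-row loop entirely. One caveat: if the real data has several rows with the same Date and Type, dcast will fall back to counting them unless you pass an aggregation function, so you may need to keep only the last quote per second first.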
