nil
nil

Reputation: 45

Merge irregular time series data sets

I am trying to merge several data sets. However, each has irregular hourly time stamps. My goal is merging the data if they are in the same hour interval, and fill a regular time series time table. As an example you can see two data sets:

 x <- structure(list(Date = structure(1:5, .Label = c("09.09.2011 21:54", 
"09.09.2011 22:59", "09.10.2011 00:04", "09.10.2011 01:09", "09.10.2011 02:14"
), class = "factor"), hexane = c(0, 0, 0, 0, 0), benzene = structure(c(1L, 
2L, 4L, 3L, 5L), .Label = c("0", "4.4", "4.7", "6.3", "7.7"), class = "factor"), 
    toluene = c(2.2, 2.6, 3.5, 2.7, 3.1)), .Names = c("Date", 
"hexane", "benzene", "toluene"), row.names = c(NA, 5L), class = "data.frame")
> 

    y <- structure(list(Date = structure(1:5, .Label = c("09.09.2011 21:54", 
"09.09.2011 22:59", "09.10.2011 00:04", "09.10.2011 01:09", "09.10.2011 02:14"
), class = "factor"), ethane = c(14.4, 868.9, 547, 491.4, 56.1
), propane = c(6.4, 32.1, 23.7, 22.8, 7.2), isobutane = c(1.7, 
2, 1.8, 1.3, 1.1), n.butane = c(3.1, 3, 3.7, 4.3, 2.9), isopentane = c(5.6, 
3, 2.4, 3.4, 2.7), n.pentane = c(1.4, 2.4, 2.3, 2.4, 2.3)), .Names = c("Date", 
"ethane", "propane", "isobutane", "n.butane", "isopentane", "n.pentane"
), row.names = c(NA, 5L), class = "data.frame")

na.fill (x, NA)
na.fill (y, NA

)

#identify "Date" column

x <- as.POSIXct(x$Date,format='%m.%d.%y %H:%M')
y <- as.POSIXct(y$Date,format='%m.%d.%y %H:%M')

#merge two data sets

merged_data <- merge.data.frame(x, y, by='Date', all=TRUE)

However, the Date column on output file "merged_data" is filled with NA. I need an hourly regular time stamp on Date column.

The aimed output file

Upvotes: 0

Views: 212

Answers (1)

GGamba
GGamba

Reputation: 13680

Your merged_date$Date is NA because the cast to POSIXct fails. There are two step to obtain your result.

  1. Cast the Date column of your dfs as an actual Date object
  2. Round (or truncate) to the hour and join the two dfs

Cast as Date

Several way to do this:

as.POSIXct

x$Date <- as.POSIXct(x$Date, format = '%m.%d.%Y %H:%M')

Note the capital Y for the 4-digit year

strptime

Almost same as above

x$Date <- strptime(x$Date, format = '%m.%d.%Y %H:%M')

anytime

Use the awesome anytime package -saved me so much headache-

x$Date <- anytime(x$Date)

Round and join

x$Date <- anytime(x$Date)
y$Date <- anytime(y$Date)

x$Date <- format(x$Date, '%m/%d/%y %H')
y$Date <- format(y$Date, '%m/%d/%y %H')

merge(x, y, by = Date)

Date          hexane benzene toluene ethane propane isobutane n.butane isopentane n.pentane
# 09/09/11 21      0       0     2.2   14.4     6.4       1.7      3.1        5.6       1.4
# 09/09/11 22      0     4.4     2.6  868.9    32.1       2.0      3.0        3.0       2.4
# 09/10/11 00      0     6.3     3.5  547.0    23.7       1.8      3.7        2.4       2.3
# 09/10/11 01      0     4.7     2.7  491.4    22.8       1.3      4.3        3.4       2.4
# 09/10/11 02      0     7.7     3.1   56.1     7.2       1.1      2.9        2.7       2.3

Hope this helps

Upvotes: 1

Related Questions