Reputation: 45
I am trying to merge several data sets. However, each has irregular hourly time stamps. My goal is merging the data if they are in the same hour interval, and fill a regular time series time table. As an example you can see two data sets:
x <- structure(list(Date = structure(1:5, .Label = c("09.09.2011 21:54",
"09.09.2011 22:59", "09.10.2011 00:04", "09.10.2011 01:09", "09.10.2011 02:14"
), class = "factor"), hexane = c(0, 0, 0, 0, 0), benzene = structure(c(1L,
2L, 4L, 3L, 5L), .Label = c("0", "4.4", "4.7", "6.3", "7.7"), class = "factor"),
toluene = c(2.2, 2.6, 3.5, 2.7, 3.1)), .Names = c("Date",
"hexane", "benzene", "toluene"), row.names = c(NA, 5L), class = "data.frame")
>
y <- structure(list(Date = structure(1:5, .Label = c("09.09.2011 21:54",
"09.09.2011 22:59", "09.10.2011 00:04", "09.10.2011 01:09", "09.10.2011 02:14"
), class = "factor"), ethane = c(14.4, 868.9, 547, 491.4, 56.1
), propane = c(6.4, 32.1, 23.7, 22.8, 7.2), isobutane = c(1.7,
2, 1.8, 1.3, 1.1), n.butane = c(3.1, 3, 3.7, 4.3, 2.9), isopentane = c(5.6,
3, 2.4, 3.4, 2.7), n.pentane = c(1.4, 2.4, 2.3, 2.4, 2.3)), .Names = c("Date",
"ethane", "propane", "isobutane", "n.butane", "isopentane", "n.pentane"
), row.names = c(NA, 5L), class = "data.frame")
na.fill (x, NA)
na.fill (y, NA
)
#identify "Date" column
x <- as.POSIXct(x$Date,format='%m.%d.%y %H:%M')
y <- as.POSIXct(y$Date,format='%m.%d.%y %H:%M')
#merge two data sets
merged_data <- merge.data.frame(x, y, by='Date', all=TRUE)
However, the Date column on output file "merged_data" is filled with NA. I need an hourly regular time stamp on Date column.
Upvotes: 0
Views: 212
Reputation: 13680
Your merged_date$Date is NA because the cast to POSIXct fails. There are two step to obtain your result.
Date
column of your dfs as an actual Date objectSeveral way to do this:
x$Date <- as.POSIXct(x$Date, format = '%m.%d.%Y %H:%M')
Note the capital Y for the 4-digit year
Almost same as above
x$Date <- strptime(x$Date, format = '%m.%d.%Y %H:%M')
Use the awesome anytime
package -saved me so much headache-
x$Date <- anytime(x$Date)
x$Date <- anytime(x$Date)
y$Date <- anytime(y$Date)
x$Date <- format(x$Date, '%m/%d/%y %H')
y$Date <- format(y$Date, '%m/%d/%y %H')
merge(x, y, by = Date)
Date hexane benzene toluene ethane propane isobutane n.butane isopentane n.pentane
# 09/09/11 21 0 0 2.2 14.4 6.4 1.7 3.1 5.6 1.4
# 09/09/11 22 0 4.4 2.6 868.9 32.1 2.0 3.0 3.0 2.4
# 09/10/11 00 0 6.3 3.5 547.0 23.7 1.8 3.7 2.4 2.3
# 09/10/11 01 0 4.7 2.7 491.4 22.8 1.3 4.3 3.4 2.4
# 09/10/11 02 0 7.7 3.1 56.1 7.2 1.1 2.9 2.7 2.3
Hope this helps
Upvotes: 1