nofunsally

Reputation: 2091

Merge dataframes using part of timestamp information

Hello, I have two data frames, each with a different timestamp format: DF1 uses m/d/YYYY H:M and DF2 uses YYYYmmdd. I would like to merge them by date (month, day, and year); the time of day does not matter for the merge. When I try to merge, the result is a data frame with NA in the ts column. I think this is because the times do not match: although DF2 does not display a time, I think it is treated as 00:00:00, so the merge doesn't work. Is there a way to ignore the time in the timestamp?

I have tried the following.

DF1 <- read.csv("DF1.csv",header=T)
DF2 <- read.csv("DF2.csv",header=T)

DF1$ts <- strptime(DF1$ts, "%m/%d/%Y %H:%M")
DF2$ts <- strptime(DF2$ts, "%Y%m%d")

DF1$ts <- as.POSIXct(DF1$ts)
DF2$ts <- as.POSIXct(DF2$ts)

try <- merge(DF1, DF2, by="ts")

#dput DF1 (sample)
DF1 <- structure(list(LENGTH = c(40, 33.6, 30, 30, 40, 43.5, 50, 40, 
62.5, 30), pLENGTH = c(0, 0.297619048, 0, 1, 0, 0, 1, 0, 0.16, 
0.666666667), LENGTH2 = c(30.7, NA, 30, NA, 25.1, 39.6, 50, 
NA, NA, NA), BURN = c(1L, NA, 1L, NA, 1L, 1L, 1L, NA, NA, 
NA), pBURN = c(0.7675, NA, 1, NA, 0.6275, 0.910344828, 1, NA, 
NA, NA), ts = structure(c(1344916800, NA, 1339819200, NA, 
1339646400, 1345003200, 1340769600, NA, NA, NA), class = c("POSIXct", 
"POSIXt"), tzone = "")), .Names = c("LENGTH", "pLENGTH", "LENGTH2", 
"BURN", "pBURN", "ts"), row.names = c(135L, 242L, 154L, 
56L, 265L, 151L, 328L, 160L, 90L, 364L), class = "data.frame")

#dput DF2 (sample)
DF2<- structure(list(TIMESTAMP = structure(c(1325883600, 1330030800, 
1337371200, 1339876800, 1339531200, 1350590400, 1325797200, 1341950400, 
1335556800, 1340827200), class = c("POSIXct", "POSIXt")), P = c(0, 
0, 0, 0, 1.3, 0, 0, 0, 0, 0), WS = c(4.023, 5.364, 4.917, 
3.129, 4.023, 4.023, 3.576, 4.023, 2.235, 2.682), WD = c(165, 
217.5, 34.5, 292.75, 167.75, 172.75, 319.25, 129.5, 148, 196)), .Names = c("ts", 
"P", "WS", "WD"), row.names = c(6L, 54L, 139L, 286L, 
164L, 292L, 5L, 192L, 118L, 121L), class = "data.frame")

Upvotes: 1

Views: 903

Answers (1)

agstudy

Reputation: 121608

You can build a date-only key with format() and merge on that key instead of on the full timestamp:

> DF1$ts1 <- format(DF1$ts,'%Y-%m-%d')
> DF2$ts1 <- format(DF2$ts,'%Y-%m-%d')
> merge(DF1,DF2,by.x='ts1',by.y ='ts1')
         ts1 LENGTH pLENGTH LENGTH2 BURN pBURN                ts.x                ts.y P    WS     WD
1 2012-06-16     30       0      30    1     1 2012-06-16 06:00:00 2012-06-16 22:00:00 0 3.129 292.75
2 2012-06-27     50       1      50    1     1 2012-06-27 06:00:00 2012-06-27 22:00:00 0 2.682 196.00
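
For anyone who wants to try the idea without the original CSV files, here is a minimal self-contained sketch. The data frames `df1` and `df2`, their values, and the `day` column name are made up for illustration, and the time zone is pinned to UTC as an assumption so that the date extracted from each timestamp does not depend on the local machine.

```r
# Toy data (hypothetical values, not the asker's files).
# The time zone is fixed to UTC so the result is reproducible.
df1 <- data.frame(
  ts = as.POSIXct(c("2012-06-16 06:00", "2012-06-27 06:00", "2012-08-14 09:30"),
                  tz = "UTC"),
  LENGTH = c(30, 50, 40)
)
df2 <- data.frame(
  ts = as.POSIXct(c("2012-06-16 00:00", "2012-06-27 00:00", "2012-07-10 00:00"),
                  tz = "UTC"),
  WS = c(3.129, 2.682, 4.023)
)

# Merging on the full timestamps compares date AND time,
# so rows on the same day but at different times do not match.
n_exact <- nrow(merge(df1, df2, by = "ts"))

# Build a date-only key in each data frame, then merge on that key.
df1$day <- as.Date(format(df1$ts, "%Y-%m-%d", tz = "UTC"))
df2$day <- as.Date(format(df2$ts, "%Y-%m-%d", tz = "UTC"))

# Both inputs still have a `ts` column; suffixes keep them apart.
merged <- merge(df1, df2, by = "day", suffixes = c(".df1", ".df2"))
print(merged)
```

If the timestamps were recorded in a local time zone, passing that zone as `tz` instead of "UTC" is probably what you want; otherwise a reading close to midnight can land on the neighbouring day.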

Upvotes: 2
