Roy1245
Roy1245

Reputation: 507

How to compare values between two dataframe

I have below mentioned two dataframe:

where i want to compare each column of DF_2 with DF_1 corresponding to Unique ID and in Dif column i want to get the exact time difference in Minute between DF_2 and DF_1, and in ST,ST2 and ST3 columns in Required Output i want to match the respective column of DF_2 with DF_1.

DF_1

ID     Date                  Val1     Val2   Val3
AB-55  2017-01-04 18:25:14   adj.f@f  ad2er  25
AB-78  2017-01-08 08:48:14   arj.t@y  ar8ey  258
AB-48  2017-01-09 21:25:45   edR.u@t  wu5eu  75

DF_2

ID     Date                  Val1     Val2   Val3
AB-55  2017-01-04 18:25:14   adj.f@f  ad2er  25
AB-78  2017-01-08 08:58:14   arj.t@y  ar7ey  25
AB-48  2017-01-09 21:25:45   edR.u@t  wu5eu  75

Required Dataframe:

ID     Date                Dif   Val1    ST   Val2  ST2   Val3 ST3
AB-55  2017-01-04 18:25:14 0     adj.f@f T    ad2er T     25   T
AB-78  2017-01-08 08:48:14 10    arj.t@y T    ar8ey F     258  F
AB-48  2017-01-09 21:25:45 0     edR.u@t T    wu5eu T     75   T

Upvotes: 0

Views: 80

Answers (1)

rg255
rg255

Reputation: 4169

Here is a solution using difftime which will work assuming you have the same rows in each data frame.

DF_1$Dif <- as.numeric(difftime(DF_2$Date, DF_1$Date, units = "mins"))

...and the data frame:

> DF_1
     ID                Date Dif
1 AB-55 2017-01-04 18:25:14   0
2 AB-78 2017-01-08 08:48:14  10

Alternatively, if the rows don't match up, first implement a merge first:

DF_3 <- merge(DF_1, DF_2, by = "ID", all.x = T, all.y = T)
DF_3$Dif <- as.numeric(difftime(DF_3$Date.y, DF_3$Date.x, units = "mins"))

And the result...

> DF_3
     ID              Date.x              Date.y Dif
1 AB-55 2017-01-04 18:25:14 2017-01-04 18:25:14   0
2 AB-78 2017-01-08 08:48:14 2017-01-08 08:58:14  10
3 AB-89 2017-01-08 10:35:14                <NA>  NA
4 AD-87                <NA> 2017-01-08 06:15:12  NA

To include the other columns, as requested in the edits you made to the question:

# Here I'm just making your dataframe, please provide code to make reproducible data where possible in the future
DF_1 <- data.frame("ID" = c("AB-55", "AB-78", "AB-89"), "Date" = c("2017-01-04 18:25:14", "2017-01-08 08:48:14", "2017-01-08 10:35:14")) 
DF_2 <- data.frame("ID" = c("AB-55", "AB-78", "AD-87"), "Date" = c("2017-01-04 18:25:14", "2017-01-08 08:58:14", "2017-01-08 06:15:12")) 

DF_1$Val1 <- c("adj.f@f", "adj.t@y", "edR.u@t")
DF_2$Val1 <- c("adj.f@f", "adj.t@y", "edR.u@t")

DF_1$Val2 <- c("ad2er", "ar7ey", "wu5eu")
DF_2$Val2 <- c("ad2er", "ar8ey", "wu5eu")

DF_1$Val3 <- c(25, 258, 75)
DF_2$Val3 <- c(25,  25, 75)

# The merge and difftime part remains
DF_3 <- merge(DF_1, DF_2, by = "ID", all.x = T, all.y = T)
DF_3$Dif <- as.numeric(difftime(DF_3$Date.y, DF_3$Date.x, units = "mins"))

# True/False part
DF_3$ST1 <- DF_3$Val1.x == DF_3$Val1.y
DF_3$ST2 <- DF_3$Val2.x == DF_3$Val2.y
DF_3$ST3 <- DF_3$Val3.x == DF_3$Val3.y

And the data frame:

DF_3


> DF_3
     ID              Date.x  Val1.x Val2.x Val3.x              Date.y  Val1.y Val2.y Val3.y Dif  ST1   ST2   ST3
1 AB-55 2017-01-04 18:25:14 adj.f@f  ad2er     25 2017-01-04 18:25:14 adj.f@f  ad2er     25   0 TRUE  TRUE  TRUE
2 AB-78 2017-01-08 08:48:14 adj.t@y  ar7ey    258 2017-01-08 08:58:14 adj.t@y  ar8ey     25  10 TRUE FALSE FALSE
3 AB-89 2017-01-08 10:35:14 edR.u@t  wu5eu     75                <NA>    <NA>   <NA>     NA  NA   NA    NA    NA
4 AD-87                <NA>    <NA>   <NA>     NA 2017-01-08 06:15:12 edR.u@t  wu5eu     75  NA   NA    NA    NA

Upvotes: 1

Related Questions