Reputation: 357
Havent found this one yet so here we go. I am using datatables and I want to merge two tables that differn in length. Ich have done that successfully.
So basically I merge all rows where my key attribute "Name, Date1" from my table1 is euqal to my key attribute "Name, Date2" from my table 2.
The problem I got with this, that my Date2 from my table 2 disapperas. I do not like that. I want to keep my date2 from my table 2.
Example:
Table1
Name Date1
A 2018-01-01
A 2018-01-02
A 2018-01-03
Table2
Name Date2
A 2018-01-01
A 2018-01-02
A 2018-01-15
Instead of looking like this in my
New table
Name Date
A 2018-01-01
A 2018-01-02
A NA
it should look like this
Name Date1 Date2
A 2018-01-01 2018-01-01
A 2018-01-02 2018-01-02
A NA NA
Does anyone know the trick?
Upvotes: 0
Views: 146
Reputation: 25225
The Date2 is not repeated in the output because it is used as a key and is essentially the same data when you are doing an equi-join. And since you are doing all.x=TRUE
, it is a left join and only the left table (i.e. ONE
is returned with any other non-key columns).
If you really want, you can duplicate the Date2 into Date1 and use Date1 for joining as follows:
TWO[, Date1:=Date2][ONE, on=.(Name, Date1)]
# Name Date2 Date1
#1: A 2018-01-01 2018-01-01
#2: A 2018-01-02 2018-01-02
#3: A NA 2018-01-03
data:
library(data.table)
ONE <- fread("Name Date1
A 2018-01-01
A 2018-01-02
A 2018-01-03")
TWO <- fread("Name Date2
A 2018-01-01
A 2018-01-02
A 2018-01-15")
Upvotes: 1
Reputation: 18681
Here is a sqldf
solution:
library(sqldf)
sqldf('select a.Name,
case when b.Date2 is NULL
then a.Date1 = NULL
else a.Date1
end as Date1
,b.Date2
from df1 as a
left join df2 as b
on a.Name = b.Name and
a.Date1 = b.Date2')
Result:
Name Date1 Date2
1 A 2018-01-01 2018-01-01
2 A 2018-01-02 2018-01-02
3 A <NA> <NA>
Data:
df1 = structure(list(Name = structure(c(1L, 1L, 1L), .Label = "A", class = "factor"),
Date1 = structure(1:3, .Label = c("2018-01-01", "2018-01-02",
"2018-01-03"), class = "factor")), .Names = c("Name", "Date1"
), class = "data.frame", row.names = c(NA, -3L))
df2 = structure(list(Name = structure(c(1L, 1L, 1L), .Label = "A", class = "factor"),
Date2 = structure(1:3, .Label = c("2018-01-01", "2018-01-02",
"2018-01-15"), class = "factor")), .Names = c("Name", "Date2"
), class = "data.frame", row.names = c(NA, -3L))
Upvotes: 0
Reputation: 4551
I would use match
. If you're dates are stored in a POSIXct
variant or as strings, this should work as is:
t1 <- read.table(text = "Name Date1
A 2018-01-01
A 2018-01-02
A 2018-01-03", header = TRUE)
t2 <- read.table(text = "Name Date2
A 2018-01-01
A 2018-01-02
A 2018-01-15", header = TRUE)
library(dplyr)
t3 <- t1 %>%
mutate(
Date2 = t2$Date2[match(Date1, t2$Date2)], # this grabs Date2 from t2 only if the same Date appears in t1
Date1 = ifelse(is.na(Date2), NA, Date1) # this sets Date1 to NA if Date2 is NA
)
If you're dates are stored as factors, the code to set Date1
to NA will need to be modified as follows (I didn't mess with dates, just left them as factors, so I needed this:
Date1 = ifelse(is.na(Date2), NA, as.character(Date1))
(note that this goes in the mutate statement, it doesn't stand alone)
Here's a variant that does not use mutate
that should work on data.tables
- although I don't personally use data.table
, so please test and confirm:
t3 <- with(t1, {
Date2 <- t2$Date2[match(Date1, t2$Date2)] # this grabs Date2 from t2 only if the same Date appears in t1
Date1 <- ifelse(is.na(Date2), NA, Date1) # this sets Date1 to NA if Date2 is NA
})
Upvotes: 0