bli12blu12
bli12blu12

Reputation: 357

merge two datasets with datatable but keeping one of the identical column

Havent found this one yet so here we go. I am using datatables and I want to merge two tables that differn in length. Ich have done that successfully.

So basically I merge all rows where my key attribute "Name, Date1" from my table1 is euqal to my key attribute "Name, Date2" from my table 2.

The problem I got with this, that my Date2 from my table 2 disapperas. I do not like that. I want to keep my date2 from my table 2.

Example:

Table1

Name   Date1
A      2018-01-01
A      2018-01-02
A      2018-01-03

Table2

Name   Date2
A      2018-01-01
A      2018-01-02
A      2018-01-15

Instead of looking like this in my

New table

Name   Date
A      2018-01-01
A      2018-01-02
A      NA

it should look like this

Name   Date1         Date2
A      2018-01-01    2018-01-01
A      2018-01-02    2018-01-02
A      NA            NA

Does anyone know the trick?

Upvotes: 0

Views: 146

Answers (3)

chinsoon12
chinsoon12

Reputation: 25225

The Date2 is not repeated in the output because it is used as a key and is essentially the same data when you are doing an equi-join. And since you are doing all.x=TRUE, it is a left join and only the left table (i.e. ONE is returned with any other non-key columns).

If you really want, you can duplicate the Date2 into Date1 and use Date1 for joining as follows:

TWO[, Date1:=Date2][ONE, on=.(Name, Date1)]
#   Name      Date2      Date1
#1:    A 2018-01-01 2018-01-01
#2:    A 2018-01-02 2018-01-02
#3:    A         NA 2018-01-03

data:

library(data.table)
ONE <- fread("Name   Date1
A      2018-01-01
A      2018-01-02
A      2018-01-03")
TWO <- fread("Name   Date2
A      2018-01-01
A      2018-01-02
A      2018-01-15")

Upvotes: 1

acylam
acylam

Reputation: 18681

Here is a sqldf solution:

library(sqldf)

sqldf('select a.Name, 
              case when b.Date2 is NULL 
                   then a.Date1 = NULL
                   else a.Date1 
              end as Date1 
              ,b.Date2 
      from df1 as a 
      left join df2 as b
      on a.Name = b.Name and
         a.Date1 = b.Date2')

Result:

  Name      Date1      Date2
1    A 2018-01-01 2018-01-01
2    A 2018-01-02 2018-01-02
3    A       <NA>       <NA>

Data:

df1 = structure(list(Name = structure(c(1L, 1L, 1L), .Label = "A", class = "factor"), 
    Date1 = structure(1:3, .Label = c("2018-01-01", "2018-01-02", 
    "2018-01-03"), class = "factor")), .Names = c("Name", "Date1"
), class = "data.frame", row.names = c(NA, -3L))

df2 = structure(list(Name = structure(c(1L, 1L, 1L), .Label = "A", class = "factor"), 
    Date2 = structure(1:3, .Label = c("2018-01-01", "2018-01-02", 
    "2018-01-15"), class = "factor")), .Names = c("Name", "Date2"
), class = "data.frame", row.names = c(NA, -3L))

Upvotes: 0

Melissa Key
Melissa Key

Reputation: 4551

I would use match. If you're dates are stored in a POSIXct variant or as strings, this should work as is:

t1 <- read.table(text = "Name   Date1
A      2018-01-01
  A      2018-01-02
  A      2018-01-03", header = TRUE)

t2 <- read.table(text = "Name   Date2
A      2018-01-01
A      2018-01-02
A      2018-01-15", header = TRUE)

library(dplyr)
t3 <- t1 %>%
  mutate(
    Date2 = t2$Date2[match(Date1, t2$Date2)], # this grabs Date2 from t2 only if the same Date appears in t1
    Date1 = ifelse(is.na(Date2), NA, Date1) # this sets Date1 to NA if Date2 is NA
  )

If you're dates are stored as factors, the code to set Date1 to NA will need to be modified as follows (I didn't mess with dates, just left them as factors, so I needed this:

Date1 = ifelse(is.na(Date2), NA, as.character(Date1))

(note that this goes in the mutate statement, it doesn't stand alone)

Edit

Here's a variant that does not use mutate that should work on data.tables - although I don't personally use data.table, so please test and confirm:

t3 <- with(t1, {
    Date2 <- t2$Date2[match(Date1, t2$Date2)] # this grabs Date2 from t2 only if the same Date appears in t1
    Date1 <- ifelse(is.na(Date2), NA, Date1) # this sets Date1 to NA if Date2 is NA
})

Upvotes: 0

Related Questions