ABD
ABD

Reputation: 25

Merge 2 R dataframes keeping matched rows from 2nd dataframe and unmatched from 1st

I would like to merge 2 dataframes in by matching the id column in the following way

dfmain = 
  id  name val res
1  1    a    
2  2    b    
3  3    c    
4  4    d    
5  5    e   

and

dfsub = 
  id name  val    res
1  2 two   true   thanks
2  4 four  false  Sorry

to get

dfmain =
   id  name  val    res
1:  1  a      
2:  2  two   true   thanks
3:  3  c      
4:  4  four  false  Sorry
5:  5  e      

Please note that -

  1. the columns in both the dataframes will remain the same in number and names
  2. the id values in the second dataframe will always be a subset of those in the first dataframe

Currently I am using anti_join function to get unmatched rows in the first dataframe and joining the second dataframe to these rows

Is there any more efficient method to do this in place?

Tried using setDT from data.table library but I was only able to update values of one column at a time.

Sorry if I am missing any obvious solution that exists as I am new to R, any help would be appreciated

Upvotes: 1

Views: 1465

Answers (2)

ThomasIsCoding
ThomasIsCoding

Reputation: 101064

You can try (thank data by @Anoushiravan R)

library(data.table)
library(dplyr)

setDT(dfsub)[setDT(dfmain),
  on = "id"
][,
  names(dfmain),
  with = FALSE
][
  ,
  Map(coalesce, .SD, dfmain)
]

which gives

   id name   val    res
1:  1    a    NA   <NA>
2:  2  two  TRUE thanks
3:  3    c    NA   <NA>
4:  4 four FALSE  Sorry
5:  5    e    NA   <NA>

Upvotes: 0

Anoushiravan R
Anoushiravan R

Reputation: 21908

I hope this is what you have in mind, otherwise please let me know. I noticed that you only replaced rows in dfmain with those of the same id in dfsub by also retaining the columns of dfsub, so here is how I think might get you to what you want:

library(dplyr)

dfmain <- tribble(
    ~id,  ~name, ~ val, ~ res,
  1,    "a",    NA,     NA,
  2,    "b",    NA,     NA,
  3,    "c",    NA,     NA,
  4,    "d",    NA,     NA,
  5,    "e" ,   NA,     NA
)

dfsub <- tribble(
    ~id, ~name,  ~val,    ~res,
  2, "two",   TRUE,   "thanks",
  4 ,"four",  FALSE,  "Sorry"
)

dfmain %>%
  filter(! id %in% dfsub$id) %>%
  bind_rows(dfsub) %>%
  arrange(id)

# A tibble: 5 x 4
     id name  val   res   
  <dbl> <chr> <lgl> <chr> 
1     1 a     NA    NA    
2     2 two   TRUE  thanks
3     3 c     NA    NA    
4     4 four  FALSE Sorry 
5     5 e     NA    NA  

Upvotes: 0

Related Questions