Reputation: 25
I would like to merge 2 dataframes in by matching the id column in the following way
dfmain =
id name val res
1 1 a
2 2 b
3 3 c
4 4 d
5 5 e
and
dfsub =
id name val res
1 2 two true thanks
2 4 four false Sorry
to get
dfmain =
id name val res
1: 1 a
2: 2 two true thanks
3: 3 c
4: 4 four false Sorry
5: 5 e
Please note that -
Currently I am using anti_join function to get unmatched rows in the first dataframe and joining the second dataframe to these rows
Is there any more efficient method to do this in place?
Tried using setDT from data.table library but I was only able to update values of one column at a time.
Sorry if I am missing any obvious solution that exists as I am new to R, any help would be appreciated
Upvotes: 1
Views: 1465
Reputation: 101064
You can try (thank data by @Anoushiravan R)
library(data.table)
library(dplyr)
setDT(dfsub)[setDT(dfmain),
on = "id"
][,
names(dfmain),
with = FALSE
][
,
Map(coalesce, .SD, dfmain)
]
which gives
id name val res
1: 1 a NA <NA>
2: 2 two TRUE thanks
3: 3 c NA <NA>
4: 4 four FALSE Sorry
5: 5 e NA <NA>
Upvotes: 0
Reputation: 21908
I hope this is what you have in mind, otherwise please let me know. I noticed that you only replaced rows in dfmain with those of the same id in dfsub by also retaining the columns of dfsub, so here is how I think might get you to what you want:
library(dplyr)
dfmain <- tribble(
~id, ~name, ~ val, ~ res,
1, "a", NA, NA,
2, "b", NA, NA,
3, "c", NA, NA,
4, "d", NA, NA,
5, "e" , NA, NA
)
dfsub <- tribble(
~id, ~name, ~val, ~res,
2, "two", TRUE, "thanks",
4 ,"four", FALSE, "Sorry"
)
dfmain %>%
filter(! id %in% dfsub$id) %>%
bind_rows(dfsub) %>%
arrange(id)
# A tibble: 5 x 4
id name val res
<dbl> <chr> <lgl> <chr>
1 1 a NA NA
2 2 two TRUE thanks
3 3 c NA NA
4 4 four FALSE Sorry
5 5 e NA NA
Upvotes: 0