Reputation: 411
Let's say I have two starting data frames:
df1 <- data.frame(code1 = c("a", "b","z"), code2 = c("2", "3", "4"))
df2 <- data.frame(code1 = c("c", "o", "p"), code2 = c("2", "4", "5"),
column3 = "a", column4 = "b", column5 = "c")
I want to match the two data frames on the column 'code2' and, wherever there is a match, replace the value of code1 in df1 with the value of code1 in df2, so that the final data frame looks like this:
df3 <- data.frame(code1 = c("c", "b", "o"), code2 = c("2", "3", "4"))
Upvotes: 1
Views: 520
Reputation: 3326
Here's a solution with dplyr. It "looks up" code1 in df2 wherever code2 matches; when no match is found, it defaults to the original code1 in df1.
library(dplyr)
# ...
# Code to generate 'df1' and 'df2'.
# ...
df1 %>% mutate(code1 = coalesce(
  # Look up the 'code1' according to 'code2'...
  df2$code1[match(code2, df2$code2)],
  # ...and otherwise default to the original 'code1'.
  code1
))
Given df1 and df2 as in your example
df1 <- data.frame(
  code1 = c("a", "b", "z"),
  code2 = c("2", "3", "4")
)
df2 <- data.frame(
  code1 = c("c", "o", "p"),
  code2 = c("2", "4", "5"),
  column3 = "a",
  column4 = "b",
  column5 = "c"
)
this solution should yield the desired result:
  code1 code2
1     c     2
2     b     3
3     o     4
One advantage of using match() rather than a dplyr::*_join(): no additional steps are needed to purge extraneous columns from your results.
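To make the lookup less opaque, here is a rough base R sketch of the same idea, broken into steps. It assumes the code columns are character vectors (the default in R 4.0 and later; stringsAsFactors = FALSE is set explicitly to be safe), and the names idx and looked_up are just illustrative.

```r
df1 <- data.frame(code1 = c("a", "b", "z"), code2 = c("2", "3", "4"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(code1 = c("c", "o", "p"), code2 = c("2", "4", "5"),
                  stringsAsFactors = FALSE)

# Position in df2 of each df1$code2 value; NA where there is no match.
idx <- match(df1$code2, df2$code2)

# Candidate replacements from df2; NA wherever the lookup failed.
looked_up <- df2$code1[idx]

# Keep the original code1 wherever the lookup came back NA.
df3 <- df1
df3$code1 <- ifelse(is.na(looked_up), df1$code1, looked_up)
```

Note that match() only returns the first matching position, so if df2 had repeated code2 values, only the first one would be used.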
Upvotes: 0
Reputation: 29203
Using left_join and coalesce:
library(dplyr)
df1 %>%
  left_join(df2[, c(1, 2)], by = "code2") %>%
  transmute(code1 = coalesce(code1.y, code1.x),
            code2 = code2)
#>   code1 code2
#> 1     c     2
#> 2     b     3
#> 3     o     4
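A possible variation, offered as a sketch rather than a drop-in replacement: selecting the lookup columns by name avoids depending on column order in df2, and distinct() guards against duplicated code2 values in df2, which would otherwise make left_join return extra rows. The lookup name is only illustrative, and the example assumes character columns.

```r
library(dplyr)

df1 <- data.frame(code1 = c("a", "b", "z"), code2 = c("2", "3", "4"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(code1 = c("c", "o", "p"), code2 = c("2", "4", "5"),
                  column3 = "a", column4 = "b", column5 = "c",
                  stringsAsFactors = FALSE)

# Keep only the columns needed for the lookup, one row per code2 value.
lookup <- df2 %>%
  select(code1, code2) %>%
  distinct(code2, .keep_all = TRUE)

df3 <- df1 %>%
  # Both tables have 'code1', so the join yields code1.x (df1) and code1.y (df2).
  left_join(lookup, by = "code2") %>%
  # Prefer the value from df2; fall back to the original where it is NA.
  transmute(code1 = coalesce(code1.y, code1.x), code2)
```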
Upvotes: 3