Reputation: 137
How can I perform a left_join using dplyr
with two different keys and, when the second key is not available, fall back to joining the tables on the first key only?
Thank you
EDIT
Here is the logic:
If both keys are found in the dataset, use both keys. If the second key is not found (NA
or no match) in the dataset, use only the first one as the merging key.
Upvotes: 0
Views: 859
Reputation: 3047
You can try something along these lines:
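library(tidyverse)

# Example data: key2 is missing (NA) for "B" in df2
df1 <- tibble(
  key1 = c("A", "B"),
  key2 = c(1, 2),
  value_df1 = runif(2)
)
df2 <- tibble(
  key1 = c("A", "B"),
  key2 = c(1, NA),
  value_df2 = runif(2)
)

df_merged <- df1 %>%
  # First attempt: match on both keys
  left_join(df2, by = c("key1", "key2")) %>%
  # Fallback: match on key1 only; the duplicated column gets .x/.y suffixes
  left_join(df2 %>% select(-key2), by = "key1") %>%
  # Prefer the two-key match, otherwise take the key1-only match
  mutate(value_df2 = coalesce(value_df2.x, value_df2.y)) %>%
  select(key1, key2, value_df1, value_df2)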
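One caveat: if df2 can hold several rows for the same key1, the key1-only join will duplicate rows of df1. A possible variant, only as a sketch, splits the work into an exact match and a fallback for the unmatched rows. The data values and the names exact, fallback and result are made up for illustration, and keeping the first df2 row per key1 is an assumption you may want to replace with your own rule.

```r
library(dplyr)

# Illustrative data (hypothetical values); "C" has no partner in df2 at all
df1 <- tibble(key1 = c("A", "B", "C"), key2 = c(1, 2, 3), value_df1 = c(10, 20, 30))
df2 <- tibble(key1 = c("A", "B"), key2 = c(1, NA), value_df2 = c(100, 200))

# Step 1: rows of df1 that match df2 on both keys
exact <- inner_join(df1, df2, by = c("key1", "key2"))

# Step 2: rows of df1 without a two-key match fall back to key1 only.
# Assumption: one df2 row per key1 is enough, so keep the first one.
df2_by_key1 <- df2 %>%
  select(-key2) %>%
  distinct(key1, .keep_all = TRUE)

fallback <- df1 %>%
  anti_join(df2, by = c("key1", "key2")) %>%
  left_join(df2_by_key1, by = "key1")

# Step 3: stack both parts back together
result <- bind_rows(exact, fallback) %>%
  arrange(key1, key2)
```

Here "A" is matched on both keys, "B" falls back to key1, and "C" stays in the result with NA in value_df2. Also note that dplyr joins treat NA as matching NA by default; pass na_matches = "never" to the join if an NA in key2 should always count as missing.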
Upvotes: 1