John E.
Reputation: 137

R dplyr: How to perform a left_join with two keys when the second key is not available

How can I perform a left_join using dplyr with two different keys and, when the second key is not available, join the tables using only the first key?

Thank you

EDIT

Here is the logic:

If both keys are found in the dataset, use both keys. If the second key is not found (it is NA or has no match) in the dataset, use only the first key as the merging key.

Upvotes: 0

Views: 859

Answers (1)

Jakub.Novotny
Reputation: 3047

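You can try something along these lines: join on both keys first, then join again on key1 only, and use coalesce() to take the two-key match where it exists and the key1-only match otherwise.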

library(tidyverse)

df1 <- tibble(
  key1 = c("A", "B"),
  key2 = c(1, 2),
  value_df1 = runif(2)
)

df2 <- tibble(
  key1 = c("A", "B"),
  key2 = c(1, NA),
  value_df2 = runif(2)
)

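# Join on both keys first; rows without a full match get NA in value_df2
df_merged <- df1 %>%
  left_join(df2, by = c("key1", "key2")) %>%
  # Join again on key1 only; the clashing columns become value_df2.x / value_df2.y
  left_join(df2 %>% select(-key2), by = "key1") %>%
  # Prefer the two-key match, fall back to the key1-only match
  mutate(value_df2 = coalesce(value_df2.x, value_df2.y)) %>%
  select(key1, key2, value_df1, value_df2)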
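
For a result that is easy to check by eye, here is a variation on the same idea with fixed values instead of runif(). It is only a sketch under a few assumptions: the extra "C" row, the df2_fallback table and the value_fallback column are hypothetical names added for illustration, and keeping the first row per key1 via distinct() is just one possible rule for cases where df2 has several rows for the same key1.

```r
library(dplyr)

# Hypothetical data with fixed values so the outcome is predictable
df1 <- tibble(
  key1 = c("A", "B", "C"),
  key2 = c(1, 2, 3),
  value_df1 = c(10, 20, 30)
)

df2 <- tibble(
  key1 = c("A", "B"),
  key2 = c(1, NA),
  value_df2 = c(100, 200)
)

# Fallback table keyed on key1 only (assumption: keep the first row per key1).
# Renaming the value column up front avoids the .x/.y suffixes after the join.
df2_fallback <- df2 %>%
  distinct(key1, .keep_all = TRUE) %>%
  select(key1, value_fallback = value_df2)

df_merged <- df1 %>%
  # First attempt: match on both keys
  left_join(df2, by = c("key1", "key2")) %>%
  # Second attempt: match on key1 only
  left_join(df2_fallback, by = "key1") %>%
  # Use the two-key match when present, otherwise the key1-only match
  mutate(value_df2 = coalesce(value_df2, value_fallback)) %>%
  select(key1, key2, value_df1, value_df2)

df_merged
```

Row "A" matches on both keys, row "B" falls back to the key1-only match because its key2 in df2 is NA, and row "C" stays NA because key1 has no match at all.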

Upvotes: 1
