Vignesh
Vignesh

Reputation: 952

Join two data frames with partially common and unequal columns

Is there a better way to leverage the power of merge and join in R? Merge looses unique rows and join creates duplicate and partially filled columns.

Dataframe1
Key  Col1  Col2  Col3
A    1     2     3
B    2     4     6

Dataframe2
Key  Col1  Col2  Col4
A    1     2     4
C    3     6     12
D    4     8     20

Merged Dataframe
Key  Col1  Col2  Col3  Col4
A    1     2     3     4
B    2     4     6     <NA> 
C    3     6     <NA>  12
D    4     8     <NA>  20

Upvotes: 2

Views: 75

Answers (2)

akrun
akrun

Reputation: 887223

We could bind the datasets with bind_rows and then do a group by summarise or reframe to return only the non-NA rows

library(dplyr)
bind_rows(df1, df2) %>% 
  group_by(Key) %>%
  reframe(across(everything(), ~ .x[!is.na(.x)][1])) 

-output

# A tibble: 4 × 5
  Key    Col1  Col2  Col3  Col4
  <chr> <int> <int> <int> <int>
1 A         1     2     3     4
2 B         2     4     6    NA
3 C         3     6    NA    12
4 D         4     8    NA    20

Or may use powerjoin

library(powerjoin)
power_full_join(df1, df2, by = "Key", conflict = coalesce_xy) %>% 
    select(Key, order(names(.)[-1])+1)

-output

   Key Col1 Col2 Col3 Col4
1   A    1    2    3    4
2   B    2    4    6   NA
3   C    3    6   NA   12
4   D    4    8   NA   20

data

df1 <- structure(list(Key = c("A", "B"), Col1 = 1:2, Col2 = c(2L, 4L
), Col3 = c(3L, 6L)), class = "data.frame", row.names = c(NA, 
-2L))

df2 <- structure(list(Key = c("A", "C", "D"), Col1 = c(1L, 3L, 4L), 
    Col2 = c(2L, 6L, 8L), Col4 = c(4L, 12L, 20L)),
 class = "data.frame", row.names = c(NA, 
-3L))

Upvotes: 2

TarJae
TarJae

Reputation: 78937

Here is an option with full_join combined with coalesce:

library(dplyr)
full_join(df1, df2, by="Key") %>% 
  mutate(Col1 = coalesce(Col1.x, Col1.y),
         Col2 = coalesce(Col2.x, Col2.y), .before="Col3") %>% 
  select(-contains("."))

  Key Col1 Col2 Col3 Col4
1   A    1    2    3    4
2   B    2    4    6   NA
3   C    3    6   NA   12
4   D    4    8   NA   20

Upvotes: 2

Related Questions