kaix
kaix

Reputation: 305

Merging two data frame where row duplicates

I have 2 data frame where the first one df1 have duplicated row. I would like to cbind it with df2 to get the desired result df3

>df1
 Months      A     B     C
1 Jan        4     NA     0
2 Feb        6     NA     0
3 Mar        9     NA    21
4 Jan        NA    1     NA
5 Feb        NA    0     NA
6 Mar        NA    6     NA



   >df2
 Months      D     E     F
1 Jan        1     1     0
2 Feb        2     1     0
3 Mar        3     1     0

I would like to merge df1 and df2, remove those duplicated rows in df1. df3 is my desired result. Thanks.

  >df3
 Months      A     B     C    D     E    F
1 Jan        4     1     0    1     1    0
2 Feb        6     0     0    2     1    0
3 Mar        9     6    21    3     1    0




df1 >
 structure(list(Months = c("Jan", "Feb", "Mar", "Jan", "Feb", "Mar"), A = c("4", "6", "9", "NA", 
 "NA", "NA"), B = c("NA", "NA", 
 "NA", "1", "0", "6"), C = c("0", "0", "21", "NA", "NA", "NA")), row.names = c(NA, 
 -6L), class = c("tbl_df", "tbl", "data.frame"))



df2 >
structure(list(Months = c("Jan", "Feb", "Mar"), D = c(1, 2, 3
), E = c(1, 1, 1), F = c(0, 0, 0)), row.names = c(NA, -3L), class = c("tbl_df", 
"tbl", "data.frame"))

Upvotes: 1

Views: 49

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 388982

You actually have string "NA" in your data and not actual NA values. Try to drop "NA" values from the columns in df1 and then join it with df2.

library(dplyr)

df1 %>%
  group_by(Months) %>%
  summarise(across(A:C, ~.[. != 'NA'])) %>%
  inner_join(df2, by = 'Months')

#  Months A     B     C         D     E     F
#  <chr>  <chr> <chr> <chr> <dbl> <dbl> <dbl>
#1 Feb    6     0     0         2     1     0
#2 Jan    4     1     0         1     1     0
#3 Mar    9     6     21        3     1     0

Upvotes: 3

Related Questions