Rebel_47
Rebel_47

Reputation: 79

How to update new column in data-frame with specific column from another data-frame with duplicated in r?

Hello all My df1 looks like

MNO  DOB
123  NA
123  NA
234  NA
234  NA
345  NA
456  NA
456  NA

My df2 have

MNO  DOB
123  22-04-1996
234  16-06-1994
345  05-05-1990
456  18-08-2000

I used merge to get results using df3 <- merge(df1,df2,by.x = "MNO",all.x = TRUE)

I get output looks like

MNO  DOB
123  22-04-1996
123  NA
234  16-06-1994
234  NA
345  05-05-1990
456  18-08-2000
456  NA

Expected results :

MNO  DOB
123  22-04-1996
123  22-04-1996
234  16-06-1994
234  16-06-1994
345  05-05-1990
456  18-08-2000
456  18-08-2000

Upvotes: 3

Views: 57

Answers (2)

agstudy
agstudy

Reputation: 121588

Another solution using data.table with roll=T option:

merge(dx[,list(MNO)],dx2,roll=TRUE)

 MNO        DOB
1: 123 22-04-1996
2: 123 22-04-1996
3: 234 16-06-1994
4: 234 16-06-1994
5: 345 05-05-1990
6: 456 18-08-2000
7: 456 18-08-2000

Upvotes: 2

akrun
akrun

Reputation: 887501

Based on the example showed, the columns should be merged by the same column 'MNO', so we can use by instead of by.x

merge(df1[1], df2, by = "MNO", all.x = TRUE)
#  MNO        DOB
#1 123 22-04-1996
#2 123 22-04-1996
#3 234 16-06-1994
#4 234 16-06-1994
#5 345 05-05-1990
#6 456 18-08-2000
#7 456 18-08-2000

If the output of 'df3' is based on a different code, after the merge step we can fill from tidyr to fill the NA elements with the previous non-NA

library(tidyr)
library(dplyr)
df3 <- df3 %>%
             fill(DOB)

Upvotes: 3

Related Questions