William Humphries
William Humphries

Reputation: 568

Combine dataframes and overwrite values in table 1 with all values in table 2

I have two dataframes that need to be merged on a single matching column and the second table needs to replace all values in the first table where the columns match.

I've tried several variations of the below code but nothing is returning a dataframe the same size as the original.

merge.data.frame(x, y, by.x = "Name", by.y = "Name")
merge.data.frame(x, y, by.x = "SN", all.x = FALSE,y all.y = TRUE)

The variables being used for this are shown below

x <- data.frame("SN" = 1:4, "Age" = c(21,15,44,55), "Name" = c("John","Dora", NA, NA))
y <- data.frame("SN" = 4, "Age" = c(100), "Name" = c("B"))
  SN Age Name       # x dataframe
1  1  21 John 
2  2  15 Dora 
3  3  44  NA 
4  4  55  NA

  SN Age Name       # y dataframe
5  4 100  B

The final result of the dataframe should be what is below:

Joined on "SN" and overwriting the values in columns "Age" and "Name" replacing what is in x with what is in y.

  SN Age Name       # result
1  1  21 John 
2  2  15 Dora 
3  3  44  NA 
4  4 100  B

Edit: If you have additional columns as shown in the dataframe below that are not in y

  SN Gender Age Name       # z table with additional column
1  1   M    21  John 
2  2   F    15  Dora 
3  3   M    44   NA 
4  4   M    55   NA

Running the code below from plyr package will drop unnecessary columns

library(plyr)
rbind.fill(z[!z$SN %in% y$SN,], y[,])
  SN Gender Age Name       # result if additional columns are present in z and not in y
1  1   M    21  John 
2  2   F    15  Dora 
3  3   M    44   NA 
4  4   M    100  B

Upvotes: 0

Views: 511

Answers (2)

davsjob
davsjob

Reputation: 1960

A longer solution that I find easier to read is to first filter rows in x that are presend in y, then add y to x.

# Added stringsAsFactors as False in df's.
x <- data.frame("SN" = 1:4, "Age" = c(21,15,44,55), "Name" = c("John","Dora", NA, NA), stringsAsFactors = F)
y <- data.frame("SN" = 4, "Age" = c(100), "Name" = c("B"), stringsAsFactors = F)

# Joins
x %>% 
  anti_join(y, by = c("SN")) %>%  # Remove row from x that are in y
  bind_rows(y)                    # Add rows from y to x

Upvotes: 2

lroha
lroha

Reputation: 34586

You can remove SN values in x that match SN values in y then row bind the the two dataframes.

rbind(x[!x$SN %in% y$SN,], y) 

  SN Age Name
1  1  21 John
2  2  15 Dora
3  3  44 <NA>
4  4 100    B

Upvotes: 4

Related Questions