Overwrite values in Dataframe based on another Dataframe matched by two columns

I have a dataframe where the first column has dates, the second column has IDs, and other columns have values. Some of the values in one column, let's say the third one, should be updated or inserted based on another dataframe if columns 1 and 2 match. However, not all columns in both dataframes are the same.

There is this solution, however, it does not work for two columns or dataframes with different sizes.

Here is some example data to make what I need clearer:

df1=structure(list(Date = structure(c(19216, 19216, 19219, 19219), class = "Date"), 
    ID = c("id1", "id2", "id1", "id2"), X1 = c(-8, -10, 5, 11
    ), X2 = c(0, 0, 0, 0), X3 = c("A", "A", "A", "A")), row.names = c(NA, 
-4L), class = "data.frame")

df2=structure(list(Date = structure(c(19216, 19219, 19220), class = "Date"), 
    ID = c("id1", "id1", "id1"), X1 = c(-3, 0, 2), Y = c(2, 2, 
    -1)), class = "data.frame", row.names = c(NA, -3L))

df_result=structure(list(Date = structure(c(19216, 19216, 19219, 19219, 
19220), class = "Date"), ID = c("id1", "id2", "id1", "id2", "id1"
), X1 = c(-3, -10, 0, 11, 2), X2 = c(0, 0, 0, 0, NA), X3 = c("A", 
"A", "A", "A", NA)), row.names = c(NA, 5L), class = "data.frame")

Upvotes: 0

Views: 584

Answers (1)

eivicent
eivicent

Reputation: 273

You can use dplyr::rows_upsert to update existing rows in your first data frame and add the new ones. Just make sure that both dataframes have the same columns and that there are no duplicates in the second dataframe. Keys that you are trying to update should be unique

library(dplyr)
df_result2 = rows_upsert(df1, select(df2, -Y), by = c("Date", "ID"))
print(df_result2)

        Date  ID  X1 X2   X3
1 2022-08-12 id1  -3  0    A
2 2022-08-12 id2 -10  0    A
3 2022-08-15 id1   0  0    A
4 2022-08-15 id2  11  0    A
5 2022-08-16 id1   2 NA <NA>

This seems to work.

Upvotes: 2

Related Questions