wolferman

Reputation: 23

Merge data.frame in R

I have a question regarding a specific type of merge of data.frames in R (I found a lot of similar problems, but couldn't get any of the solutions to work for my specific case).

Suppose I have two data frames, each with two columns X1 and X2:

df1 =

            X1         X2
    1  '01.01.2000'    4
    2  '01.01.2001'    5
    3  '01.01.2002'    6

df2 =

            X1         X2
    1  '01.01.2002'    8
    2  '01.01.2003'    9
    3  '01.01.2004'    10

What I want is a merged dataframe according to the following rules:

  1. If a value in X1 is only in df1, use the value of X2 in df1
  2. If a value in X1 is in both df1 and df2 use the value of X2 from df2
  3. If a value in X1 is only in df2, use the value of X2 in df2

For df1 and df2 above, this would mean:

dfMerged =

            X1         X2
    1  '01.01.2000'    4
    2  '01.01.2001'    5
    3  '01.01.2002'    8
    4  '01.01.2003'    9
    5  '01.01.2004'    10

Currently, I'm using a very slow solution that merges first and then iterates over all rows (see the sketch below). I also tried various approaches using dplyr::union() etc., but couldn't find a proper solution. Any help is greatly appreciated!
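
For reference, my current approach looks roughly like this (a simplified sketch, not the exact code):

# merge everything first, then loop over the rows to pick the right X2
dfAll <- merge(df1, df2, by = "X1", all = TRUE)  # creates X2.x (from df1) and X2.y (from df2)
dfAll$X2 <- NA
for (i in seq_len(nrow(dfAll))) {
  # prefer the df2 value whenever it exists, otherwise fall back to df1
  dfAll$X2[i] <- if (!is.na(dfAll$X2.y[i])) dfAll$X2.y[i] else dfAll$X2.x[i]
}
dfMerged <- dfAll[, c("X1", "X2")]

This gives the correct result but becomes very slow for large data frames.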

Upvotes: 2

Views: 367

Answers (2)

thothal

Reputation: 20329

Data

df1 <- structure(list(X1 = c("01.01.2000", "01.01.2001", "01.01.2002"), 
                      X2 = 4:6), 
                 class = "data.frame", 
                 row.names = c(NA, -3L))

df2 <- structure(list(X1 = c("01.01.2002", "01.01.2003", "01.01.2004"), 
                      X2 = 8:10), 
                 class = "data.frame", 
                 row.names = c(NA, -3L))

Code

library(dplyr)
full_join(df1, df2, by = "X1") %>%
    mutate(X2 = case_when(!is.na(X2.x) & !is.na(X2.y) ~ X2.y, 
                          is.na(X2.y)                 ~ X2.x, 
                          is.na(X2.x)                 ~ X2.y)) %>% 
    select(X1, X2)

Explanation

  1. First, you do a full_join on both data sets with X1 as the joining column. This creates columns X2.x and X2.y, which hold the X2 values from the respective data sets (the intermediate result for the sample data is shown below).
  2. Then it is just a straightforward application of mutate with case_when to pick the right column according to the rules you gave.
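
For the sample data, the intermediate full_join result looks like this:

full_join(df1, df2, by = "X1")
#           X1 X2.x X2.y
# 1 01.01.2000    4   NA
# 2 01.01.2001    5   NA
# 3 01.01.2002    6    8
# 4 01.01.2003   NA    9
# 5 01.01.2004   NA   10

As an aside, the three case_when branches can also be written more compactly as coalesce(X2.y, X2.x), which takes X2.y wherever it is not NA and falls back to X2.x otherwise.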

Benchmark

The distinct solution is consistently faster, by roughly a factor of 3, as the following benchmark shows:

library(tidyverse)
library(microbenchmark)
make_data_frame <- function(n, percent_matching = .1) {
   ids_a <- ids_b <- paste0("ID_", seq.int(n))
   non_matching_ids <- sample(n, round(n * (1 - percent_matching), 0))
   ids_b[non_matching_ids] <- paste(ids_b[non_matching_ids], "b", sep = "_")
   list(A = data.frame(X1 = ids_a, X2 = "a", stringsAsFactors = FALSE),
        B = data.frame(X1 = ids_b, X2 = "b", stringsAsFactors = FALSE))
}

.distinct <- function(dfs) {
   bind_rows(dfs$B, dfs$A) %>% 
      distinct(X1, .keep_all = TRUE)
}

.join <- function(dfs) {
   full_join(dfs$A, dfs$B, by = "X1") %>%
      mutate(X2 = case_when(!is.na(X2.x) & !is.na(X2.y) ~ X2.y, 
                            is.na(X2.y)                 ~ X2.x, 
                            is.na(X2.x)                 ~ X2.y))
}
scenarios <- expand.grid(n = c(1e4, 1e5, 1e6),
                         percent_matching = c(.1, .5, .9))
all_data <- pmap(scenarios, make_data_frame)
all_mb <- map(all_data, ~ microbenchmark(.distinct(.x), .join(.x)))
map_dfr(seq.int(NROW(scenarios)), function(i) {
   mdat <- scenarios[i, ]
   my_summary <- summary(all_mb[[i]])
   rownames(mdat) <- NULL
   rownames(my_summary) <- NULL
   cbind(mdat, my_summary)
}) %>%
select(n, percent_matching, expr, mean)

#        n percent_matching          expr        mean
# 1  1e+04              0.1 .distinct(.x)    4.975013
# 2  1e+04              0.1     .join(.x)   12.587072
# 3  1e+05              0.1 .distinct(.x)   59.577142
# 4  1e+05              0.1     .join(.x)  149.987451
# 5  1e+06              0.1 .distinct(.x)    1.158597
# 6  1e+06              0.1     .join(.x)    2.699003
# 7  1e+04              0.5 .distinct(.x)    4.485196
# 8  1e+04              0.5     .join(.x)   11.902656
# 9  1e+05              0.5 .distinct(.x)   46.660016
# 10 1e+05              0.5     .join(.x)  132.180758
# 11 1e+06              0.5 .distinct(.x)  913.503111
# 12 1e+06              0.5     .join(.x) 2148.531600
# 13 1e+04              0.9 .distinct(.x)    4.299905
# 14 1e+04              0.9     .join(.x)   12.731292
# 15 1e+05              0.9 .distinct(.x)   37.558069
# 16 1e+05              0.9     .join(.x)  111.428117
# 17 1e+06              0.9 .distinct(.x)  458.030035
# 18 1e+06              0.9     .join(.x) 1458.408847

Upvotes: 3

Teun

Reputation: 579

You could use the following. It simply row-binds the data frames and, in case of duplicates (based on X1), drops the row from df1.

library(dplyr)
df1 <- data.frame(X1 = c("01.01.2000", "01.01.2001", "01.01.2002"),
                  X2 = c(4, 5, 6), stringsAsFactors = FALSE)
df2 <- data.frame(X1 = c("01.01.2002", "01.01.2003", "01.01.2004"),
                  X2 = c(8, 9, 10), stringsAsFactors = FALSE)

dfMerged <- bind_rows(df2, df1) %>% 
  distinct(X1, .keep_all = TRUE) %>% 
  arrange(X1, X2)
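
With the sample data this reproduces the expected result:

dfMerged
#           X1 X2
# 1 01.01.2000  4
# 2 01.01.2001  5
# 3 01.01.2002  8
# 4 01.01.2003  9
# 5 01.01.2004 10

Note that arrange(X1) sorts the dates as character strings; that happens to be chronological here because only the year differs, but for general dd.mm.yyyy dates you would want to convert with as.Date(X1, format = "%d.%m.%Y") before sorting.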

Upvotes: 4
