Hanif
Hanif

Reputation: 397

Merge dataframes based on columns and rbind matches

I have two dataframes as:

df <- data.frame(x=c("a","c"), y=c("111","222"))

df1 <- data.frame(x=c("a","b","b","c"), y=c("111","222","111","222"),z=c('xxx','yyy','ddd','ttt'))

I want to merge these two dataframes based on column x, in such a way that the matched rows from df1 should be added as rows to df and now as columns as traditionally the merge function joins dataframes on columns. The resulting dataframe should look like:

X  Y   Z
a 111 <NA>
a 111 xxx
c 222 <NA>
c 222 ttt

I know it can be done through for loops but I was wondering if it is possible through one/two line of code instead of writing multiple for loops.

Upvotes: 0

Views: 118

Answers (2)

s_baldur
s_baldur

Reputation: 33613


A base R option:

# Add missing column to df
df[setdiff(names(df1), names(df))] <- NA
# rbind only rows matching on x
rbind(df, df1[df1$x %in% df$x,])
   x   y    z
1  a 111 <NA>
2  c 222 <NA>
11 a 111  xxx
4  c 222  ttt

A dplyr option:

library(dplyr)
bind_rows(df, filter(df1, x %in% df$x))
  x   y    z
1 a 111 <NA>
2 c 222 <NA>
3 a 111  xxx
4 c 222  ttt

A data.table option:

library(data.table)
setDT(df)
setDT(df1)
rbind(df, df1[x %in% df$x], fill = TRUE)
   x   y    z
1: a 111 <NA>
2: c 222 <NA>
3: a 111  xxx
4: c 222  ttt

Upvotes: 3

Georgery
Georgery

Reputation: 8127

library(dplyr)

df1 %>%
    semi_join(df, by = "x") %>%
    bind_rows(df)

  x   y    z
1 a 111  xxx
2 c 222  ttt
3 a 111 <NA>
4 c 222 <NA>

Upvotes: 2

Related Questions