Reputation: 397
I have two dataframes as:
df <- data.frame(x=c("a","c"), y=c("111","222"))
df1 <- data.frame(x=c("a","b","b","c"), y=c("111","222","111","222"),z=c('xxx','yyy','ddd','ttt'))
I want to merge these two dataframes based on column x, in such a way that the matched rows from df1 should be added as rows to df and now as columns as traditionally the merge function joins dataframes on columns. The resulting dataframe should look like:
X Y Z
a 111 <NA>
a 111 xxx
c 222 <NA>
c 222 ttt
I know it can be done through for loops but I was wondering if it is possible through one/two line of code instead of writing multiple for loops.
Upvotes: 0
Views: 118
Reputation: 33613
A base R
option:
# Add missing column to df
df[setdiff(names(df1), names(df))] <- NA
# rbind only rows matching on x
rbind(df, df1[df1$x %in% df$x,])
x y z
1 a 111 <NA>
2 c 222 <NA>
11 a 111 xxx
4 c 222 ttt
A dplyr
option:
library(dplyr)
bind_rows(df, filter(df1, x %in% df$x))
x y z
1 a 111 <NA>
2 c 222 <NA>
3 a 111 xxx
4 c 222 ttt
A data.table
option:
library(data.table)
setDT(df)
setDT(df1)
rbind(df, df1[x %in% df$x], fill = TRUE)
x y z
1: a 111 <NA>
2: c 222 <NA>
3: a 111 xxx
4: c 222 ttt
Upvotes: 3
Reputation: 8127
library(dplyr)
df1 %>%
semi_join(df, by = "x") %>%
bind_rows(df)
x y z
1 a 111 xxx
2 c 222 ttt
3 a 111 <NA>
4 c 222 <NA>
Upvotes: 2