superasiantomtom95
superasiantomtom95

Reputation: 553

Merging 2 dataframes with duplicate columns?

I have an empty dataframe as such:

a <- data.frame(x = rep(NA,10))

which gives the following:

    x
1  NA
2  NA
3  NA
4  NA
5  NA
6  NA
7  NA
8  NA
9  NA
10 NA

and I have another dataframe as such (the non-sequential row numbers are because this dataframe is a subset of a much larger dataframe):

    x
1  NA
2  4
3  NA
5  NA
6  5
7  71
8  3

What I want to do is to merge the 2 dataframes together the values from b will replace the current values in x for an output like this:

x
1  NA
2  4
3  NA
4  NA
5  NA
6  5
7  71
8  3
9  NA
10 NA

My first instinct is to use a for loop like this:

for (i in rownames(b)){
    a[i,"x"] <- b[i,"x"]
}

However, this is inefficient for large dataframes. I haven't seen an implementation of this using merge and cbind/rbind yet.

Is there a more efficient way to accomplish this?

Upvotes: 3

Views: 53

Answers (2)

M--
M--

Reputation: 28945

We can merge based on rownames:

a <- data.frame(x = rep(NA,10))
b <- data.frame(x = c(NA,4,NA,NA,5,71,3))

data.frame(x=merge(a, b, by=0, suffixes = c(".a","") ,all=TRUE)[,"x"])
#>     x
#> 1  NA
#> 2  NA
#> 3   4
#> 4  NA
#> 5  NA
#> 6   5
#> 7  71
#> 8   3
#> 9  NA
#> 10 NA

d.b answer is the efficient one.

Upvotes: 2

d.b
d.b

Reputation: 32548

transform(a, x = b[row.names(a),])
#    x
#1  NA
#2   4
#3  NA
#4  NA
#5  NA
#6   5
#7  71
#8   3
#9  NA
#10 NA

Upvotes: 3

Related Questions