ben_aaron
ben_aaron

Reputation: 1532

Fill NAs in dataframe with a matching dataframe merge in R

I have three data frames. One 'main' data frame to which I want to merge the other two sub data frames.

main = data.frame(a = LETTERS[1:5], b = round(rnorm(5),2))

a     b
A  0.41
B -1.06
C -0.65
D -1.27
E  1.03

sub1 = data.frame(a = LETTERS[2:4], c = sample(c(100, 200, 300), 3, replace=T))

a   c
B 200
C 200
D 200

sub2 = data.frame(a = LETTERS[c(1,5)], c = sample(c(999, 888), 2, replace=T))

a   c
A 999
E 888

In the first merge, I want to merge sub1 to main via "a" and this works.

merged1 = merge(main, sub1, by="a", all.x=T)

a     b   c
A  0.41  NA
B -1.06 300
C -0.65 300
D -1.27 100
E  1.03  NA

Now I want to join/merge the sub2 to merged1. In my case, the values provided by sub2 substitute the NAs in column c. Specifically I want to replace these NAs with the values from sub2. I tried this (and other arguments for all.x, all.y:

merge(merged1, sub2, by="a", all.x=T)

a     b c.x c.y
A  0.41  NA 999
B -1.06 300  NA
C -0.65 300  NA
D -1.27 100  NA
E  1.03  NA 999

How can I get this to be only one column c with the merge?

Upvotes: 0

Views: 219

Answers (1)

Chris P
Chris P

Reputation: 225

Note: My seed was set, so the results are different than those in the question.

If you bind your sub1 or sub2 together, you can do it all in one merge.

set.seed(42)

main = data.frame(a = LETTERS[1:5], b = round(rnorm(5),2))

sub1 = data.frame(a = LETTERS[2:4], c = sample(c(100, 200, 300), 3, replace=T))

sub2 = data.frame(a = LETTERS[c(1,5)], c = sample(c(999, 888), 2, replace=T))

#Create one data frame
all_sub <- rbind(sub1,sub2)

merged <- merge(main, all_sub, by="a", all.x=T)

The output is as expected:

> merged
  a     b   c
1 A  1.37 999
2 B -0.56 200
3 C  0.36 300
4 D  0.63 300
5 E  0.40 999

Upvotes: 4

Related Questions