Bonono
Bonono

Reputation: 847

Select non NA values in match when matching columns

I have two data.frames, aa and bb:

    aa <- data.frame(ID = c(rep(letters[1:4]),"d","d", "e","e"), Name = c("Dave", "Charlie", "Ella", NA,"Timothy","James",NA, "Bob"))
aa
  ID    Name
1  a    Dave
2  b Charlie
3  c    Ella
4  d    <NA>
5  d Timothy
6  d   James
7  e    <NA>
8  e     Bob

bb <- data.frame(ID = rep(letters[1:5]),Name = 0)

 bb
  ID Name
1  a    0
2  b    0
3  c    0
4  d    0

I want to fill bb$Name by matching bb$ID with aa$ID so that I get the earliest name in the alphabet (in this case James):

 ID    Name
1  a    Dave
2  b Charlie
3  c    Ella
4  d   James
5  e    Bob

However, if I try bb$Name <- aa[match(bb$ID,aa$ID),2] I get:

bb

     ID    Name
1  a    Dave
2  b Charlie
3  c    Ella
4  d    <NA>
5  e    <NA>

Do you think some if statement code along the lines of: if there are more than 1 of the same values in aa$ID, and if one of those values are NA, then select the one that is of earlier alphabetical order to match with. So the d value in bb$Name would be James

I understand I could subset the original data but I don't want to do that in my actual data as the information that aa$Name has an NA value is important

Upvotes: 1

Views: 2960

Answers (1)

A.Yazdiha
A.Yazdiha

Reputation: 1378

Is it ok to create a temporary copy of aa?

tmp <- aa[!is.na(aa$Name),]
tmp <- tmp[order(tmp$Name),]
bb$Name <- tmp[match(bb$ID,tmp$ID),2]
bb

This gets you the result you wanted.

Upvotes: 2

Related Questions