Dr.E77
Dr.E77

Reputation: 107

Conditional matching based on two variables

I have a DF with similar multiple rows and columns. I want to match Name with Name1. If Name does not match Name1, I want to also try to matchName to Name3. Then if it matches, I want the value of Name2 to return as a new variable, Namevalue.

df
Name      Name1    Name2   Name3
Natalie   Shawn    Edmund  Natalie
Desmond   Desmond  James   Sue
Kylie     Kylie    Brent   Manny

Desired output

df
Name      Name1    Name2   Name3     Namevalue
Natalie   Shawn    Edmund  Natalie   Edmund
Desmond   Desmond  Tyler   Sue       Tyler
Kylie     Chris    Tam     Kylie     Tam
Leah      Chewie   Luke    Solo      Unknown

I've tried this:

Merge1$Namevalue <- ifelse(Merge1$Name %in% Merge1$Name1, Merge1$Name2, 
                    paste("Unknown"),
                    ifelse(Merge1$Name %in% Merge1$Name3, Merge1$Name2, paste("Unknown"))

Unfortunately, it doesn't work where it doesn't match properly. Any help would be appreciated.

Upvotes: 3

Views: 120

Answers (1)

fabla
fabla

Reputation: 1816

Your ifelse() idea was a good start, however, you got the condition slightly wrong, hope this helps:

Code

Test: if there is a match in name1 or (|) in name3, assign name2:

df$Namevalue <- ifelse(df$Name %in% df$Name1 | df$Name %in% df$Name3,
                       df$Name2,
                       NA)
#
     Name   Name1  Name2   Name3 Namevalue
1 Natalie   Shawn Edmund Natalie    Edmund
2 Desmond Desmond  Tyler     Sue     Tyler
3   Kylie   Chris    Tam   Kylie       Tam
4    Leah  Chewie   Luke    Solo      <NA>

Updated Answer

If one wants to know if name1 or name3 was a match.

You can first match only against name1,

df$Namevalue <- ifelse(df$Name %in% df$Name1,
                       df$Name2,
                       NA)
#
     Name   Name1  Name2   Name3 Namevalue
1 Natalie   Shawn Edmund Natalie      <NA>
2 Desmond Desmond  Tyler     Sue     Tyler
3   Kylie   Chris    Tam   Kylie      <NA>
4    Leah  Chewie   Luke    Solo      <NA>

then create a column that displays if there was a match against name1,

df$match <- rep(NA, length(df$Namevalue))
df$match[which(!is.na(df$Namevalue))] <- "Name1"
#
     Name   Name1  Name2   Name3 Namevalue match
1 Natalie   Shawn Edmund Natalie      <NA>  <NA>
2 Desmond Desmond  Tyler     Sue     Tyler Name1
3   Kylie   Chris    Tam   Kylie      <NA>  <NA>
4    Leah  Chewie   Luke    Solo      <NA>  <NA>

then match against name3 and assign name3 (in the last column) to values that have no match with name1 but a match with name3,

df$Namevalue <- ifelse(df$Name %in% df$Name3 & is.na(df$Namevalue),
                       df$Name2,
                       df$Namevalue)
#
df$match[which(is.na(df$match) & !is.na(df$Namevalue))] <- "Name3"
#
     Name   Name1  Name2   Name3 Namevalue match
1 Natalie   Shawn Edmund Natalie    Edmund Name3
2 Desmond Desmond  Tyler     Sue     Tyler Name1
3   Kylie   Chris    Tam   Kylie       Tam Name3
4    Leah  Chewie   Luke    Solo      <NA>  <NA>

the remaining NA's have no match in name1 or name3.

Data

df <- read.table(text = "
Name      Name1    Name2   Name3    
Natalie   Shawn    Edmund  Natalie   
Desmond   Desmond  Tyler   Sue       
Kylie     Chris    Tam     Kylie     
Leah      Chewie   Luke    Solo", header = T, stringsAsFactor = F)
#
     Name   Name1  Name2   Name3
1 Natalie   Shawn Edmund Natalie
2 Desmond Desmond  Tyler     Sue
3   Kylie   Chris    Tam   Kylie
4    Leah  Chewie   Luke    Solo

Upvotes: 2

Related Questions