Vector JX

Reputation: 179

Match multiple conditions on a large dataframe in R

I have the two dataframes shown below:

DF_1

Val1           Val2
COPPAR Ert     Metal          
Bittar Gourd   vegetble
Blackbery d    Fruite

DF_2

Val4           Val5        Type
Copper         Metal       A-I   
Bitter Gourd   Vegetable   B-II
Blackberry     Fruit       C-III

DF_1 contains spelling errors in Val1 and Val2 (the same strings are spelled differently than in DF_2), and DF_2 holds the correct list. I want to match Val1 of DF_1 against Val4 of DF_2 and, based on the matched correct value (New_Val1), bring Val5 (as New_Val2) and Type into the output dataframe.

Output Dataframe:

Val1           Val2      New_Val1       New_Val2    Type
COPPAR Ert     Metal     Copper         Metal       A-I
Bittar Gourd   vegetble  Bitter Gourd   Vegetable   B-II
Blackbery d    Fruite    Blackberry     Fruit       C-III

Upvotes: 3

Views: 80

Answers (1)

BENY

Reputation: 323226

This is based on Soundex: each value is reduced to a phonetic code, and the two data frames are merged on those codes.

library(phonics)

# Soundex codes for both columns of each data frame
df1$match1 <- soundex(df1$Val1)
df1$match2 <- soundex(df1$Val2)
df2$match1 <- soundex(df2$Val4)
df2$match2 <- soundex(df2$Val5)

# Join on the two phonetic keys
merge(df1, df2, by = c('match1', 'match2'))
#   match1 match2         Val1     Val2         Val4      Val5  Type
# 1   B360   V231 Bittar Gourd vegetble Bitter Gourd Vegetable  B-II
# 2   B421   F630  Blackbery d   Fruite   Blackberry     Fruit C-III
# 3   C160   M340   COPPAR Ert    Metal       Copper     Metal   A-I
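If the phonetic keys turn out to be too coarse, or the output should have exactly the columns asked for in the question, one alternative is plain edit distance on Val1 alone. The sketch below is not the Soundex method above: it uses base R's `adist()` and picks, for each row of `df1`, the closest `Val4` in `df2`. The names `df1`, `df2`, `d`, `best` and `result` are only illustrative, and the data is retyped from the question.

```r
# Data as given in the question (the misspellings in df1 are intentional).
df1 <- data.frame(
  Val1 = c("COPPAR Ert", "Bittar Gourd", "Blackbery d"),
  Val2 = c("Metal", "vegetble", "Fruite"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Val4 = c("Copper", "Bitter Gourd", "Blackberry"),
  Val5 = c("Metal", "Vegetable", "Fruit"),
  Type = c("A-I", "B-II", "C-III"),
  stringsAsFactors = FALSE
)

# Edit-distance matrix: one row per df1$Val1, one column per df2$Val4.
d <- adist(df1$Val1, df2$Val4, ignore.case = TRUE)

# For each row of df1, index of the df2 entry with the smallest distance.
best <- apply(d, 1, which.min)

# Keep the original columns and append the corrected values from df2.
result <- data.frame(
  df1,
  New_Val1 = df2$Val4[best],
  New_Val2 = df2$Val5[best],
  Type     = df2$Type[best],
  stringsAsFactors = FALSE
)
result
```

On the sample data each row of `df1` pairs with its intended entry in `df2`. Note that `which.min` always returns some match, even a poor one, so on real data it is probably worth checking the minimum distance against a threshold. The full distance matrix also grows with `nrow(df1) * nrow(df2)`, which may matter for a large dataframe.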

Upvotes: 3
