LMGagne
LMGagne

Reputation: 1716

Merge based on partial string match in pandas dfs

I have a df that looks like this

first_name last_name
John       Doe
Kelly      Stevens
Dorey      Chang

and another that looks like this

name             email
John Doe         [email protected]
Kelly M Stevens  [email protected]
D Chang          [email protected]

To merge these 2 tables, such that the end result is

first_name last_name email
    John   Doe       [email protected]
    Kelly  Stevens   [email protected]
    Dorey  Chang     [email protected]

I can't merge on name, but all emails contain each persons last name even if the overall format is different. Is there a way to merge these using only a partial string match?

I've tried things like this with no success:

df1['email']= df2[df2['email'].str.contains(df['last_name'])==True]

Upvotes: 2

Views: 140

Answers (1)

Quang Hoang
Quang Hoang

Reputation: 150785

IIUC, you can do with merge on the result of an extract:

df1.merge(df2.assign(last_name=df2['name'].str.extract(' (\w+)$'))
             .drop('name', axis=1),
          on='last_name',
          how='left')

Output:

  first_name last_name                      email
0       John       Doe           [email protected]
1      Kelly   Stevens  [email protected]
2      Dorey     Chang          [email protected]

Upvotes: 2

Related Questions