baqm
baqm

Reputation: 131

How to merge pandas dataframe based on substring of column elements python

I have two dataframes A and B. Dataframe A looks like:

col1    col2   col3
 a_low   5      6
 a_low   3      10
 a_high  4      4

Dataframe B looks like:

col1 colB 
 a    90

Now, I want to merge df A and B on the substring a in col1 from df A and col1 from df B. Hence, the result should be:

col1    col2   col3  colB
 a_low   5      6     90
 a_low   3      10    90
 a_high  4      4     90

Anyone knows how to do this using merge?

Upvotes: 0

Views: 90

Answers (3)

Anurag Dabas
Anurag Dabas

Reputation: 24314

You can also use assign() and merge() method:

result=dfA.assign(col1=dfA['col1'].str.split('_').str[0]).merge(dfB).assign(col1=dfA['col1'])

Now if you print result you will get your desired output:

    col1    col2    col3    colB
0   a_low   5       6       90
1   a_low   3       10      90
2   a_high  4       4       90

Upvotes: 2

azal
azal

Reputation: 1260

You can use str.extract + insert:

pat = "|".join(df_a.col1)
df_b.insert(0, 'a', df_b['col1'].str.extract("(" + pat + ')', expand=False))

Upvotes: 1

Quang Hoang
Quang Hoang

Reputation: 150785

You need to extract the part of string from col1, e.g. with str.split or str.extract(), then either merge or map:

dfA['colB'] = (dfA['col1'].str.split('_').str[0]
                  .map(dfB.set_index('col1')['colB']
              )

Upvotes: 2

Related Questions