How to get the keyword that was matched from a list of keywords while searching in every row of a dataframe?

Question

I have a column "Description" in my dataframe, and I am searching this column for a list of keywords. I can already get True or False depending on whether any keyword is present in a given row. I want to add one more column that shows which keyword from the list matched the text in that row.

For example:

content = ['paypal', 'silverline', 'bcg', 'onecap']

#dataframe df

Description        Debit  Keyword_present
onech xmx paypal   555    True
xxl 1ef yyy        141    False
bcg tte exact      411    True

And the new column should look like:

 Keyword
 paypal
 NA
 bcg

So far, I have only managed to get True/False values indicating whether any of the keywords is present.

#content is my list of keywords

present = df['Description'].str.contains('|'.join(content))

df['Keyword_present'] = present

Quang Hoang · Accepted Answer

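Instead of contains, use extract with a slightly different pattern: wrap the alternation in a capture group so that the matched keyword itself is returned (NaN where nothing matches):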

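# capture group around the alternation, e.g. '(paypal|silverline|bcg|onecap)'
pattern = '(' + '|'.join(content) + ')'
# expand=False returns a Series (first match per row) rather than a one-column DataFrame
df['Keyword Present'] = df.Description.str.extract(pattern, expand=False)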

Output:

        Description  Debit  Keyword_present Keyword Present
0  onech xmx paypal    555             True          paypal
1       xxl 1ef yyy    141            False             NaN
2     bcg tte exact    411             True             bcg
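
For anyone who wants to run this end to end, here is a minimal self-contained sketch built from the sample data in the question. Two things in it are my own additions rather than part of the answer above: re.escape on each keyword (an assumption that the keywords should be matched literally, which matters if one ever contains a character such as '.' or '+'), and a 'Keyword' column name taken from the question's desired output. Note that str.extract returns only the first match in each row.

```python
import re

import pandas as pd

content = ['paypal', 'silverline', 'bcg', 'onecap']

# Sample data copied from the question
df = pd.DataFrame({
    'Description': ['onech xmx paypal', 'xxl 1ef yyy', 'bcg tte exact'],
    'Debit': [555, 141, 411],
})

# Assumption: keywords are literal strings, so escape regex metacharacters
pattern = '(' + '|'.join(re.escape(k) for k in content) + ')'

# First matching keyword per row; NaN where no keyword occurs
df['Keyword'] = df['Description'].str.extract(pattern, expand=False)

# The True/False flag can be derived from the extracted keyword
df['Keyword_present'] = df['Keyword'].notna()

print(df)
```

Deriving the flag from the extracted column avoids running the regex twice, although calling str.contains separately as in the question works just as well.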
