Reputation: 35
I have the following two data frames:
df1
Animal    Categ_Class
---------------------
Cat       Soft
Dog       Soft
Dinosaur  Hard
df2
Text                              Animal_Exist
----------------------------------------------
The Cat is purring                True
Cat drank the milk                True
Lizard is crawling over the wall  False
The dinosaurs are extinct now     True
The Animal_Exist column in df2 indicates whether any value of df1.Animal occurs in df2.Text.
I need help writing code that produces output like this:
Output
Text                              Animal_Exist  Categ_Class
-----------------------------------------------------------
The Cat is purring                True          Soft
Cat drank the milk                True          Soft
Lizard is crawling over the wall  False         NA
The dinosaurs are extinct now     True          Hard
I am new to Python and have been trying different approaches for days. Any help is appreciated.
Regards.
Upvotes: 1
Views: 64
Reputation: 863301
Use Series.str.extract to pull the matching Animal value out of each Text (case-insensitively), convert it to lowercase, and then use Series.map to look up its Categ_Class:
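import re
# Lookup Series: lowercase animal name -> Categ_Class
s = df1.assign(Animal=df1['Animal'].str.lower()).set_index('Animal')['Categ_Class']
# One capture group with every animal name as an alternative
pat = f'({"|".join(s.index)})'
# Extract the first matching animal (ignoring case), lowercase it, map it to its class
cat = df2['Text'].str.extract(pat, expand=False, flags=re.I).str.lower().map(s)
# Rows without a match get NaN, so notna() rebuilds Animal_Exist
df2 = df2.assign(Animal_Exist=cat.notna(), Categ_Class=cat)
print(df2)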
                               Text  Animal_Exist  Categ_Class
0                The Cat is purring          True         Soft
1                Cat drank the milk          True         Soft
2  Lizard is crawling over the wall         False          NaN
3     The dinosaurs are extinct now          True         Hard
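For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The frames are rebuilt from the tables in the question; the names lookup, pattern, matched, category and result are only illustrative, and wrapping each name in re.escape is an extra precaution I am assuming is wanted in case an animal name ever contains regex special characters.

```python
import re

import pandas as pd

# Sample data rebuilt from the question.
df1 = pd.DataFrame({
    "Animal": ["Cat", "Dog", "Dinosaur"],
    "Categ_Class": ["Soft", "Soft", "Hard"],
})
df2 = pd.DataFrame({
    "Text": [
        "The Cat is purring",
        "Cat drank the milk",
        "Lizard is crawling over the wall",
        "The dinosaurs are extinct now",
    ]
})

# Lookup Series: lowercase animal name -> Categ_Class.
lookup = (
    df1.assign(Animal=df1["Animal"].str.lower())
       .set_index("Animal")["Categ_Class"]
)

# Single capture group listing every animal name as an alternative.
# re.escape is an added safeguard, not part of the snippet above.
pattern = "(" + "|".join(map(re.escape, lookup.index)) + ")"

# First animal found in each text (case-insensitive), lowercased; NaN if none.
matched = df2["Text"].str.extract(pattern, expand=False, flags=re.IGNORECASE).str.lower()

# Translate the matched animal into its category; NaN stays NaN.
category = matched.map(lookup)

result = df2.assign(Animal_Exist=category.notna(), Categ_Class=category)
print(result)
```

Note that this is a plain substring match, which is why "dinosaurs" is picked up by "Dinosaur"; it also means a short name such as "cat" would match inside a longer word like "category".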
Upvotes: 1