Reputation: 346
I have a string, and I need to check whether any of the keywords in that string are present in my dataframe.
If a keyword is present, I need to return it.
String:
question="Joe is Available"
question=question.upper()
str_list=question.split()
str_list
Out[107]:
['JOE', 'IS', 'AVAILABLE']
Dataframe:
import pandas as pd
df=pd.DataFrame({"Person1":("Ash","Joe","Harry"),"Person2":("Abe","Lisa","Katty"),"Person3":("Sam","Max","Stone")})
df=df.apply(lambda x: x.astype(str).str.upper())
Person1  Person2  Person3
ASH      ABE      SAM
JOE      LISA     MAX
HARRY    KATTY    STONE
My Attempt:
return_field=""
for x in str_list:
    print(x)
    for i in df.iterrows():
        if(df.str.contains(x)):
            return_field=x
Gives me AttributeError: 'DataFrame' object has no attribute 'str'
Expected Output
Since Joe is present in the dataframe, it should return "Joe".
Upvotes: 1
Views: 60
Reputation: 164613
If you do this repeatedly, you may wish to hash your values via set. Also, you can use map with str.upper to convert dataframe values to upper case [1]:
str_all = set(map(str.upper, df.values.ravel()))
question = "Joe is Available"
str_search = set(question.upper().split())
res = str_search & str_all
# {'JOE'}
[1] You can use pd.DataFrame.apply + lambda, but this isn't recommended. String operations via pd.Series.str are, currently, notoriously slow. Adding a lambda loop on top makes it worse.
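Since the question asks for "Joe" rather than "JOE", here is a minimal, self-contained sketch of the same hashing idea that also keeps the original spelling. It uses a dict instead of a set so the upper-cased key can be mapped back to the stored value. The names `lookup` and `find_keywords` are illustrative and not part of the original answer, and the sketch assumes each upper-cased value is unique in the dataframe.

```python
import pandas as pd

df = pd.DataFrame({
    "Person1": ("Ash", "Joe", "Harry"),
    "Person2": ("Abe", "Lisa", "Katty"),
    "Person3": ("Sam", "Max", "Stone"),
})

# Build the lookup once: upper-cased value -> original spelling.
# (hypothetical helper names; the original answer uses a plain set)
lookup = {str(v).upper(): v for v in df.values.ravel()}


def find_keywords(question, lookup):
    """Return dataframe values that appear as whole words in `question`."""
    words = question.upper().split()
    # Dict membership is a hash lookup, so each word is checked in O(1).
    return [lookup[w] for w in words if w in lookup]


result = find_keywords("Joe is Available", lookup)
print(result)  # ['Joe']
```

Building `lookup` once and reusing it for many questions is where the hashing pays off; for a single query the difference is unlikely to matter.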
Upvotes: 2
Reputation: 76917
Use a list comprehension that tests each word for membership in df.values:
In [741]: [x for x in str_list if x in df.values]
Out[741]: ['JOE']
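For completeness, a self-contained sketch of that one-liner, reusing the dataframe and string from the question. The variable name `matches` is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Person1": ("Ash", "Joe", "Harry"),
    "Person2": ("Abe", "Lisa", "Katty"),
    "Person3": ("Sam", "Max", "Stone"),
})
# Upper-case every cell, as in the question.
df = df.apply(lambda col: col.astype(str).str.upper())

str_list = "Joe is Available".upper().split()

# `x in df.values` compares x against every cell of the underlying array,
# so each word triggers a full scan of the dataframe.
matches = [x for x in str_list if x in df.values]
print(matches)  # ['JOE']
```

Note that the `.values` matters here: `x in df` on its own tests the column labels (Person1, Person2, Person3), not the cell contents.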
Upvotes: 1