Sam

Reputation: 346

Return keyword of string if found in dataframe columns

I have a string, and I need to check whether any of the keywords in that string are present in my dataframe.

If present, I need to return that keyword back.

String:

question="Joe is Available"
question=question.upper()
str_list=question.split()
str_list

Out[107]:

['JOE', 'IS', 'AVAILABLE']

Dataframe:

import pandas as pd

df = pd.DataFrame({"Person1": ("Ash", "Joe", "Harry"), "Person2": ("Abe", "Lisa", "Katty"), "Person3": ("Sam", "Max", "Stone")})
df = df.apply(lambda x: x.astype(str).str.upper())


Person1 Person2 Person3
ASH     ABE     SAM
JOE     LISA    MAX
HARRY   KATTY   STONE

My Attempt:

return_field=""
for x in str_list:
    print(x)
    for i in df.iterrows():
        if(df.str.contains(x)):
            return_field=x

This gives me AttributeError: 'DataFrame' object has no attribute 'str'

Expected Output

Since Joe is present in the dataframe, it should return "Joe".

Upvotes: 1

Views: 60

Answers (2)

jpp

Reputation: 164613

If you run this lookup repeatedly, you may wish to hash your values with a set. You can also use map with str.upper to convert the dataframe values to upper case [1]:

str_all = set(map(str.upper, df.values.ravel()))

question = "Joe is Available"
str_search = set(question.upper().split())

res = str_search & str_all

# {'JOE'}

[1] You can use pd.DataFrame.apply with a lambda, but this isn't recommended. String operations via pd.Series.str are currently notoriously slow, and adding a lambda loop on top makes it worse.
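
If you need the keyword back in its original case ("Joe" rather than "JOE"), the set lookup can be wrapped in a small helper. This is only a minimal sketch: the name find_keywords is hypothetical (not part of pandas), and it assumes every cell in the dataframe is a string.

```python
import pandas as pd


def find_keywords(question, df):
    """Hypothetical helper: return the words of `question` found in `df`."""
    # Build the upper-cased lookup set once; assumes every cell is a string.
    str_all = set(map(str.upper, df.values.ravel()))
    # Compare case-insensitively, but keep each word's original case.
    return [word for word in question.split() if word.upper() in str_all]


df = pd.DataFrame({
    "Person1": ("Ash", "Joe", "Harry"),
    "Person2": ("Abe", "Lisa", "Katty"),
    "Person3": ("Sam", "Max", "Stone"),
})

print(find_keywords("Joe is Available", df))  # ['Joe']
```

Unlike the set intersection, the list comprehension preserves the word order of the question.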

Upvotes: 2

Zero

Reputation: 76917

Use a list comprehension that tests each keyword for membership in df.values:

In [741]: [x for x in str_list if x in df.values]
Out[741]: ['JOE']
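
If you also want to know which column the match came from, DataFrame.isin may be worth a look. The following is a rough sketch rather than a definitive solution; the variable names mask, matches and cols are just illustrative, and it assumes the dataframe has already been upper-cased as in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Person1": ("Ash", "Joe", "Harry"),
    "Person2": ("Abe", "Lisa", "Katty"),
    "Person3": ("Sam", "Max", "Stone"),
})
df = df.apply(lambda x: x.astype(str).str.upper())

str_list = "Joe is Available".upper().split()

# Boolean dataframe: True wherever a cell equals one of the keywords.
mask = df.isin(str_list)

# Matching cell values, pulled out with NumPy boolean indexing.
matches = df.to_numpy()[mask.to_numpy()].tolist()

# Names of the columns that contain at least one match.
cols = mask.columns[mask.any().to_numpy()].tolist()

print(matches)  # ['JOE']
print(cols)     # ['Person1']
```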

Upvotes: 1
