Flag Exact Word Within a String Using String Contains

Question

I have a dataset that looks like this:

ID Symptoms
1  ear, fever
2  hearing loss
3  hurt ear
4  spear wound
5  bad hearing  
6  earring cut

I want to flag only the records where "ear" appears as a whole word. The desired output looks like this:

ID Symptoms         Ear
1  ear, fever        1
2  hearing loss      0
3  hurt ear          1
4  spear wound       0
5  bad hearing       0 
6  earring cut       0

I've tried a few variations with little success.

This version flags anything containing the text "ear", including "hearing" and "spear":

LABS_TAT.loc[:,"Ear"]=np.where(LABS_TAT["Symptoms"].str.contains("ear", case=False),1,0)

With a space after the word ("ear "), the record "hurt ear" is not flagged:

LABS_TAT.loc[:,"Ear"]=np.where(LABS_TAT["Symptoms"].str.contains("ear ", case=False),1,0)

With a space before the word (" ear"), the record "ear, fever" is not flagged:

LABS_TAT.loc[:,"Ear"]=np.where(LABS_TAT["Symptoms"].str.contains(" ear", case=False),1,0)

How can I fix my code so that it flags only the records containing the word "ear"? I feel like there is a simple answer, but I'm still fairly new to Python.

Shubham Sharma · Accepted Answer

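Use Series.str.contains with a regex pattern. The \b anchors match word boundaries, so "ear" only matches as a whole word, and (?i) makes the match case-insensitive: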

df['Ear'] = df['Symptoms'].str.contains(r'(?i)\bear\b').astype(int)

Result:

  ID      Symptoms   Ear
0   1    ear, fever    1
1   2  hearing loss    0
2   3      hurt ear    1
3   4   spear wound    0
4   5   bad hearing    0
5   6   earring cut    0
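
For anyone who wants to run this end to end, here is a minimal, self-contained sketch built on the sample data from the question. The helper name flag_word is hypothetical (it is not part of pandas), and the na=False argument is an assumption on my part: it treats missing Symptoms values as non-matches so that astype(int) does not fail on NaN. It also uses case=False instead of the inline (?i) flag, which should be equivalent here.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Symptoms": ["ear, fever", "hearing loss", "hurt ear",
                 "spear wound", "bad hearing", "earring cut"],
})


def flag_word(series, word):
    """Return 1 where `word` occurs as a whole word (case-insensitive), else 0.

    `flag_word` is a hypothetical helper for illustration, not a pandas API.
    """
    # re.escape protects against regex metacharacters in `word`;
    # \b matches the boundary between a word character and a non-word character.
    pattern = rf"\b{re.escape(word)}\b"
    # na=False (an assumption) counts missing values as non-matches,
    # so the boolean result can always be cast to int.
    matches = series.str.contains(pattern, case=False, regex=True, na=False)
    return matches.astype(int)


df["Ear"] = flag_word(df["Symptoms"], "ear")
print(df)
```

Note that \b relies on the keyword starting and ending with a word character (letter, digit, or underscore), so a keyword such as "c++" would need a different boundary rule.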
