snow_fall

Reputation: 225

Finding specific word strings within a pandas column using if/else statements

I'm trying to label a 'description' column based on the strings within it. I am using an if/else statement for this.

Right now it looks like this:

def char_matching(chars):
   if 'software' in chars:
       return 'Software development'
   elif 'Data' in chars:
       return 'Data Science'

But what if I want to match the words 'data science' together in the column? Do I write:

elif 'Data-science' in chars:
    return 'Data Science'

or

elif 'Data|science' in chars:
    return 'Data Science'

Does the match also depend on capitalization, i.e. 'data' versus 'Data'? How do I get around that?

Upvotes: 1

Views: 154

Answers (1)

jpp

Reputation: 164773

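The strings in your if / else tests may contain spaces, so you can check for 'data science' directly. The in operator looks for a literal substring: 'Data-science' matches only the hyphenated text, and 'Data|science' matches only text that contains the | character itself. Neither is treated as a pattern.

The in test is also case sensitive. To make it case insensitive, compare against chars.lower() and write the search strings in lowercase. The function below returns 'Data Science' if 'data science' occurs anywhere in chars, regardless of case.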

def char_matching(chars):
   val = chars.lower()
   if 'software' in val:
       return 'Software development'
   elif 'data science' in val:
       return 'Data Science'
   ...
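
As a rough check of the behaviour described above, here is a minimal runnable sketch. The sample strings and the final return None fallback are assumptions made for illustration; the original function leaves its remaining branches elided.

```python
def char_matching(chars):
    # Lowercase once so every comparison below ignores case.
    val = chars.lower()
    if 'software' in val:
        return 'Software development'
    elif 'data science' in val:
        return 'Data Science'
    # Assumption for this sketch: nothing matched, so return None.
    return None


# Hypothetical sample descriptions, not taken from the question.
samples = [
    'Senior DATA Science lead',   # matches despite the capitals
    'Software engineer',          # matches the first branch
    'Data-science intern',        # hyphen, so 'data science' is not a substring
]

for text in samples:
    print(repr(text), '->', char_matching(text))

# 'in' is a literal substring test: '|' is not read as "or".
print('Data|science' in 'Data science')
```

If the descriptions live in a pandas column, one common way to use such a function is df['description'].apply(char_matching), assuming every value in the column is a string.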

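To require several words that need not be adjacent, combine the tests with and. This matches when both words appear anywhere in the string, in any order: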

def char_matching(chars):
   val = chars.lower()
   if 'software' in val:
       return 'Software development'
   elif ('data' in val) and ('science' in val):
       return 'Data Science'
   ...
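
Note that the and version also matches strings where the words appear inside other words or far apart. If that is too loose, one possible alternative (not part of the answer above) is a regular expression with word boundaries. The sketch below uses Python's standard re module; the helper names loose_match and has_data_science and the choice to accept a hyphen between the words are assumptions for illustration.

```python
import re

# 'data' and 'science' as whole words, separated by whitespace and/or hyphens.
# re.IGNORECASE removes the need to lowercase the input first.
PHRASE = re.compile(r'\bdata[\s-]+science\b', flags=re.IGNORECASE)


def loose_match(chars):
    # The 'and' approach: both substrings anywhere, in any order.
    val = chars.lower()
    return ('data' in val) and ('science' in val)


def has_data_science(chars):
    # True only when the two words appear together as a phrase.
    return PHRASE.search(chars) is not None


for text in ['Data-Science team', 'database of political science']:
    print(repr(text), loose_match(text), has_data_science(text))
```

For a whole pandas column, Series.str.contains accepts a regular expression and a case=False argument, which may be a more direct fit than applying a Python function row by row.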

Upvotes: 2
