Cole
Cole

Reputation: 127

How can I do an if statement on a pandas dataframe to check multiple columns for specific values?

I am wanting to check a pandas dataframe to see if two columns match two unique values. I know have to check one column at a time, but not two at once.

Basically, I want to see if the person's last name is 'Smith' and their first name is either 'John' or 'Tom' all at the same time.

My code:

import pandas as pd

# create dataframe
name = {'last_name': ['smith','smith','jones','parker'], 'first_name': ['john','tom','mary','peter']}
df = pd.DataFrame(name,columns=['last_name', 'first_name'])

# this is what I want to do
# df.loc[df['last_name'] == 'smith' and df['first_name'].isin(['john', 'tom']), 'match'] = 'yes'

# this works by itself
df.loc[df['last_name'] == 'smith', 'match'] = 'yes'

# this works by itself
df.loc[df['first_name'].isin(['john', 'tom']), 'match'] = 'yes'


print(df)

Upvotes: 0

Views: 3971

Answers (1)

user7864386
user7864386

Reputation:

You want to filter rows where the last name is "Smith" AND the first name is either "John" OR "Tom". This means it's either "John Smith" OR "Tom Smith". This is equivalent to

(last_name=="Smith" AND first_name=="John") OR (last_name=="Smith" AND first_name=="Tom")

which is equivalent to:

(last_name=="smith") AND (first_name=='john' OR first_name=='tom')

the latter OR can be handled using isin:

out = df[(df['last_name']=='smith') & (df['first_name'].isin(['john','tom']))]

Output:

  last_name first_name match
0     smith       john   yes
1     smith        tom   yes

Upvotes: 1

Related Questions