Reputation: 127
I am wanting to check a pandas dataframe to see if two columns match two unique values. I know have to check one column at a time, but not two at once.
Basically, I want to see if the person's last name is 'Smith' and their first name is either 'John' or 'Tom' all at the same time.
My code:
import pandas as pd
# create dataframe
name = {'last_name': ['smith','smith','jones','parker'], 'first_name': ['john','tom','mary','peter']}
df = pd.DataFrame(name,columns=['last_name', 'first_name'])
# this is what I want to do
# df.loc[df['last_name'] == 'smith' and df['first_name'].isin(['john', 'tom']), 'match'] = 'yes'
# this works by itself
df.loc[df['last_name'] == 'smith', 'match'] = 'yes'
# this works by itself
df.loc[df['first_name'].isin(['john', 'tom']), 'match'] = 'yes'
print(df)
Upvotes: 0
Views: 3971
Reputation:
You want to filter rows where the last name is "Smith" AND the first name is either "John" OR "Tom". This means it's either "John Smith" OR "Tom Smith". This is equivalent to
(last_name=="Smith" AND first_name=="John") OR (last_name=="Smith" AND first_name=="Tom")
which is equivalent to:
(last_name=="smith") AND (first_name=='john' OR first_name=='tom')
the latter OR can be handled using isin
:
out = df[(df['last_name']=='smith') & (df['first_name'].isin(['john','tom']))]
Output:
last_name first_name match
0 smith john yes
1 smith tom yes
Upvotes: 1