Reputation: 27
I have a code that contains two columns related to gender for an individual. I would like to see if there are any discrepancies as well as find missing values. I am unfamiliar with how to do this. Any help or resources you can provide would be greatly appreciated as I am still new. If possible I would also like to create a third column that states the outcome.
d = {'First': ['Male', 'Female', '', 'Female', 'Male', ''],
'Second': ['Male', 'Male', '', 'Female', '', 'Male']
df = pd.DataFrame(data = d)
#possible output column
df['Third'] = ['match', 'discrepancy', 'missing', 'match', 'discrepancy', 'discrepancy']
Upvotes: 1
Views: 748
Reputation: 3730
Solve by using nested numpy np.where()
:
df['Third'] = np.where(df.First == df.Second, np.where(df.First.str.len() > 0, 'match', 'missing'), 'discrepancy')
print(df)
Output:
First Second Third
0 Male Male match
1 Female Male discrepancy
2 missing
3 Female Female match
4 Male discrepancy
5 Male discrepancy
Upvotes: 1
Reputation: 1475
Define a function, apply it to each row and save the result in the "Third" column
def compare(row):
first = row["First"]
second = row["Second"]
if len(first) == 0 and len(second) == 0:
return "missing"
if pd.isna(first) and pd.isna(second):
return "missing"
if first != second:
return "discrepancy"
return "match"
df["Third"] = df.apply(compare, axis=1)
df
axis=1
applies the compare
function row wise. The default is axis=0 which sends each column to the function.
Upvotes: 1
Reputation: 35686
You can apply a function to every row using apply on axis=1.
You can adjust this function to return any value you would like based on your conditions.
import pandas as pd
d = {'First': ['Male', 'Female', '', 'Female', 'Male', ''],
'Second': ['Male', 'Male', '', 'Female', '', 'Male']}
df = pd.DataFrame(data=d)
def eval_row(r):
if not r['First'] and not r['Second']:
return 'missing'
elif r['First'] == r['Second']:
return 'match'
else:
return 'discrepancy'
df['Third'] = df.apply(eval_row, axis=1)
print(df)
Produces:
First Second Third
0 Male Male match
1 Female Male discrepancy
2 missing
3 Female Female match
4 Male discrepancy
5 Male discrepancy
Upvotes: 2