Sai Kumar

Reputation: 715

Comparing columns in a DataFrame for a desired output

Assume this is my input dataframe:

Name  Death1  Return1   Death2  Return2
 A     Yes     Yes       NaN      NaN
 B     No      No        Yes      Yes
 C     Yes     Yes       Yes      Yes
 D     NaN     NaN       NaN      NaN

I'm looking to count the number of times a character is dead and store it in a new column.

# My approach.
import pandas as pd

def clean_deaths(row):
    num_deaths = 0
    cols = ['Death1', 'Death2']
    for c in cols:
        death = row[c]
        # Skip missing values and 'No'; the data is capitalized as 'Yes'/'No'.
        if pd.isnull(death) or death == 'No':
            continue
        elif death == 'Yes':
            num_deaths += 1
    return num_deaths

df['Deaths'] = df.apply(clean_deaths, axis=1)

I was not satisfied with my approach. I'd like to see other ways to achieve this.

Desired output:
Name  Death1  Return1   Death2  Return2  Deaths
 A     Yes     Yes       NaN      NaN      1
 B     No      No        Yes      Yes      1
 C     Yes     Yes       Yes      Yes      2
 D     NaN     NaN       NaN      NaN      0

Upvotes: 1

Views: 38

Answers (1)

jezrael

Reputation: 863611

I think you need to select the columns by name (or with filter), compare them with eq (==), and then sum the True values per row:

df['Deaths'] = df[['Death1', 'Death2']].eq('Yes').sum(axis=1)
print (df)
  Name Death1 Return1 Death2 Return2  Deaths
0    A    Yes     Yes    NaN     NaN       1
1    B     No      No    Yes     Yes       1
2    C    Yes     Yes    Yes     Yes       2
3    D    NaN     NaN    NaN     NaN       0

Alternatively, select the columns with filter:

df['Deaths'] = df.filter(like='Death').eq('Yes').sum(axis=1)

Detail:

print (df[['Death1', 'Death2']].eq('Yes'))
   Death1  Death2
0    True   False
1   False    True
2    True    True
3   False   False
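
To try this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, and the names deaths_by_name and deaths_by_regex are only illustrative. The regex variant is an assumption about the column naming scheme (Death followed by digits); it is one way to avoid also picking up a derived column such as Deaths, which like='Death' would match once that column exists.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question's table.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Death1": ["Yes", "No", "Yes", np.nan],
    "Return1": ["Yes", "No", "Yes", np.nan],
    "Death2": [np.nan, "Yes", "Yes", np.nan],
    "Return2": [np.nan, "Yes", "Yes", np.nan],
})

# Explicit column list: comparing NaN to 'Yes' gives False,
# so missing values simply contribute 0 to the row sum.
deaths_by_name = df[["Death1", "Death2"]].eq("Yes").sum(axis=1)

# Pattern-based selection (assumed naming scheme): matches Death1, Death2, ...
# but not a derived column named 'Deaths'.
deaths_by_regex = df.filter(regex=r"^Death\d+$").eq("Yes").sum(axis=1)

df["Deaths"] = deaths_by_name
print(df)
```

Both selections should give the same counts here. The comparison is case-sensitive, so values such as 'YES' or 'yes' would not be counted unless they are normalized first.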

Upvotes: 1
