Joan Triay
Joan Triay

Reputation: 1658

How to transform multiple boolean columns to one column with column headers and NaN?

I have a dataframe like this:

A      B      C
False  True   False
False  False  False
True   True   False
False  False  True
False  True   False

And I need to obtain something like this:

Total
B  
NaN
A
C
B

I tried with:

df[["A", "B", "C"]].idxmax(1)

but is not returning the NaN, is returning A in place of NaN.

Thanks for any kind of help.

Upvotes: 2

Views: 1111

Answers (3)

Mr. Polywhirl
Mr. Polywhirl

Reputation: 48600

You could easily do this by checking the index of True. You have to be aware that index can throw a ValueError.

data = '''
A      B      C
False  True   False
False  False  False
True   True   False
False  False  True
False  True   False
'''

def safe_index(arr, val):
    try: return arr.index(val)
    except ValueError: return -1

def calculate_totals(data):
    lines = data.strip().split('\n')
    keys = lines[0].split()
    return list(
        map(lambda i: keys[i] if i != -1 else 'NaN',
        [safe_index(line.split(), 'True') for line in lines[1:]]))

if __name__ == '__main__':
    print('\n'.join(['Total'] + calculate_totals(data)))

Upvotes: 0

Rob Raymond
Rob Raymond

Reputation: 31146

Simply check for any() being True

import pandas as pd
import io
import numpy as np

df = pd.read_csv(io.StringIO("""A      B      C
False  True   False
False  False  False
True   True   False
False  False  True
False  True   False"""), sep="\s+")

pd.Series(np.where(df.loc[:,["A","B","C"]].any(axis=1), df.loc[:,["A","B","C"]].idxmax(axis=1), np.nan), name="Total")

output

0      B
1    NaN
2      A
3      C
4      B
Name: Total, dtype: object

Upvotes: 0

anky
anky

Reputation: 75080

Try this, idxmax with series.where and df.any

df.idxmax(1).where(df.any(1))
#or np.where(df.any(1),df.idxmax(1),np.nan)

0      B
1    NaN
2      A
3      C
4      B
dtype: object

Upvotes: 3

Related Questions