Seema Mudgil

Reputation: 385

Extracting column names from a dataframe based on condition on row value

I have a dataframe

       A     B     C
 u1    0    .5    .2
 u2   .2     0    .3
 u3   .1     0     0

For each index, I need to find the names of the columns where the value is not zero. So I need this output:

        elements 
    u1  [B,C]
    u2  [A,C]
    u3  [A]

I can find the column name of the top value in each row using df.idxmax(axis=1), but how do I find the names of all matching columns?

Upvotes: 2

Views: 3100

Answers (1)

jezrael

Reputation: 862511

Use apply with axis=1 to process the frame row by row, and select the column labels by casting each row's values to bool (0 is False, any nonzero value is True):

# assign to a new name so the original df stays available
out = df.apply(lambda x: x.index[x.astype(bool)].tolist(), axis=1)
print(out)
u1    [B, C]
u2    [A, C]
u3       [A]
dtype: object
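
To try the row-wise approach end to end, here is a minimal sketch that rebuilds the sample frame from the question. The names `mask` and `elements` are illustrative only, and `df.ne(0)` is used in place of `astype(bool)` as an equivalent way of spelling "not zero" for this numeric data.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame(
    {"A": [0, 0.2, 0.1], "B": [0.5, 0, 0], "C": [0.2, 0.3, 0]},
    index=["u1", "u2", "u3"],
)

# True where the value is nonzero
mask = df.ne(0)

# For each row, keep the column labels where the mask is True
elements = mask.apply(lambda row: row.index[row.to_numpy()].tolist(), axis=1)
print(elements)
```

Building the mask first keeps the original `df` unchanged, so it can still be used by later steps.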

If the output should be strings instead of lists:

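import numpy as np
import pandas as pd
# df is the original DataFrame: keep 'label, ' where the value is nonzero
s = np.where(df, ['{}, '.format(x) for x in df.columns], '')
out = pd.Series([''.join(x).strip(', ') for x in s], index=df.index)
print(out)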
u1    B, C
u2    A, C
u3       A
dtype: object

Detail:

print (s)
[['' 'B, ' 'C, ']
 ['A, ' '' 'C, ']
 ['A, ' '' '']]
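
As a possible alternative for the string output, the labels can be joined directly from a boolean mask, which avoids having to strip a trailing separator. This is only a sketch on the question's sample data; `mask`, `joined` and `result` are illustrative names, not part of the original answer.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame(
    {"A": [0, 0.2, 0.1], "B": [0.5, 0, 0], "C": [0.2, 0.3, 0]},
    index=["u1", "u2", "u3"],
)

# Boolean NumPy array: True where the value is nonzero
mask = df.ne(0).to_numpy()

# Index each row's mask into the column labels and join them
joined = [", ".join(df.columns[row]) for row in mask]
result = pd.Series(joined, index=df.index)
print(result)
```

A row with no nonzero values would produce an empty string here.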

Upvotes: 7
