hhbilly

Reputation: 1305

Row-wise extract column names from DataFrame into Series

I'd like to build a Series in which each element is the list of column names whose value in that row is nonzero.

In [1]: import pandas as pd   

In [2]: df = pd.DataFrame({'colA': [1, 0, 1], 'colB': [0, 0, 1], 'colC': [1, 0, 0]})

In [3]: print(df)

   colA  colB  colC
0     1     0     1
1     0     0     0
2     1     1     0

The resulting Series should look like this:

0    [colA, colC]
1              []
2    [colA, colB]
dtype: object

Here's the tortured solution I came up with:

In [4]: df2 = df.T

In [5]: l = [df2[df2[i]>0].index.values.tolist() for i in range(3)]

In [6]: print(pd.Series(l))

0    [colA, colC]
1              []
2    [colA, colB]
dtype: object

Is there a less tortured way of doing this?

Upvotes: 2

Views: 70

Answers (1)

yatu

Reputation: 88305

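Assuming your DataFrame contains only 0s and 1s, you could use np.where to keep the column name wherever the value is nonzero, and create a Series from the result. Note that the join/split step relies on the column names containing no whitespace: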

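import numpy as np

x = np.where(df, df.columns, '')  # column name where the value is truthy, '' elsewhere
pd.Series([' '.join(i).split() for i in x])  # join/split drops the empty strings

0    [colA, colC]
1              []
2    [colA, colB]
dtype: object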
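
If the column names might contain spaces, a variant that avoids the string round trip could look like the sketch below. It is not part of the approach above, just one possible alternative: it converts each row to a boolean mask and indexes df.columns with it. The names mask and result are only illustrative, and the frame is the one from the question.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({'colA': [1, 0, 1], 'colB': [0, 0, 1], 'colC': [1, 0, 0]})

# One boolean row per DataFrame row: True wherever the value is nonzero
mask = df.to_numpy().astype(bool)

# Index the column labels with each row's mask and collect the hits as lists
result = pd.Series([df.columns[row].tolist() for row in mask], index=df.index)
print(result)
```

Because the labels are never joined into a single string, a column called something like 'col A' would stay intact here.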

Upvotes: 2
