Select rows where two or more columns are bigger than 0 in pandas

Question

I am working with a dataframe in pandas. My dataframe has 55 columns and 70,000 rows.

How can I select the rows where two or more values are bigger than 0?

It now looks like this:

   A   B  C   D   E
a  0   2  0   8   0
b  3   0  0   0   0
c  6   2  5   0   0

And I would like to get this:

   A   B  C   D   E  F
a  0   2  0   8   0  true
b  3   0  0   0   0  false
c  6   2  5   0   0  true

I have tried converting it to just 0's and 1's and summing that, like so:

df[df > 0] = 1

df[(df > 0).sum(axis=1) >= 2]

But then I lose all the other info in the dataframe and I still want to be able to see the original values.

U13-Forward · Accepted Answer

Try assigning to a column like this:

>>> df['F'] = df.gt(0).sum(axis=1).ge(2)
>>> df
   A  B  C  D  E      F
a  0  2  0  8  0   True
b  3  0  0  0  0  False
c  6  2  5  0  0   True
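
If the goal is to select the matching rows rather than flag them, the same expression can be used as a boolean mask. The following is a minimal sketch that rebuilds the sample frame from the question; the names `mask` and `selected` are only illustrative. Because nothing is assigned back into `df`, the original values stay intact.

```python
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame(
    {"A": [0, 3, 6], "B": [2, 0, 2], "C": [0, 0, 5], "D": [8, 0, 0], "E": [0, 0, 0]},
    index=["a", "b", "c"],
)

# Boolean Series: True where at least two values in the row are > 0.
mask = df.gt(0).sum(axis=1).ge(2)

# Keep only the matching rows; df itself is not modified.
selected = df[mask]
print(selected)
```

Here `selected` should contain rows `a` and `c` with their original values.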

Or try with astype(bool), which counts non-zero values (the same result here, as long as no value is negative):

>>> df['F'] = df.astype(bool).sum(axis=1).ge(2)
>>> df
   A  B  C  D  E      F
a  0  2  0  8  0   True
b  3  0  0  0  0  False
c  6  2  5  0  0   True
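
The two versions only agree when the data has no negative values, since `astype(bool)` treats any non-zero number as True. A small sketch with a hypothetical row (not from the question) shows where they diverge:

```python
import pandas as pd

# Hypothetical row containing negative values, not part of the question's data.
neg = pd.DataFrame({"A": [-1], "B": [-4], "C": [0]}, index=["x"])

# Counts values strictly greater than 0: none here.
by_gt = neg.gt(0).sum(axis=1).ge(2)

# Counts non-zero values: -1 and -4 both count.
by_bool = neg.astype(bool).sum(axis=1).ge(2)

print(by_gt["x"], by_bool["x"])
```

For a strict "bigger than 0" test, `gt(0)` is the safer choice of the two.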
