rsgmon
rsgmon

Reputation: 1932

Pandas select rows where columns are dynamic and a single column's value is greater than zero

Assume a DataFrame is created where the number of columns and column names is dynamic. So you could have a DataFrame like:

two = pd.DataFrame({'one' : pd.Series([10, 0, 10], index=['a', 'b', 'c']),
   'two' : pd.Series([0, 0, 10.], index=['a', 'b', 'c'])})

 one   two
a   10   0.0
b    0   0.0
c   10  10.0

Or you could have a Dataframe like:

three = pd.DataFrame({'blue' : pd.Series([10, 0, 10], index=['a', 'b', 'c']),
   'red' : pd.Series([0, 0, 10], index=['a', 'b', 'c']),
   'two' : pd.Series([0, 0, 10], index=['a', 'b', 'c'])})

   blue  red  two
a    10    0    0
b     0    0    0
c    10   10   10

So you won't know how many columns or the column names until run time. There is no limit on number of columns.

How do you select rows where only one column is greater than zero?

So for a given row if all column values are zero or if more than one column value is greater than zero its excluded from selection.

From the two above examples I'd respectfully output:

   one  two
a   10    0

and

   blue  red  two
a    10    0    0

Upvotes: 0

Views: 1361

Answers (2)

Devendra Lattu
Devendra Lattu

Reputation: 2802

two[two[two > 0].count(axis=1) > 0].head(1)

Say you have

two = pd.DataFrame({'one' : pd.Series([10, 0, 10, 9], index=['a', 'b', 'c', 'd']), 'two' : pd.Series([0, 0, 10., 4.6], index=['a', 'b', 'c', 'd']), 'three' : pd.Series([5, -1, 7, -1], index=['a', 'b', 'c', 'd'])})

enter image description here enter image description here

Select row where at least one column has value is greater than 0.
head(1) will select the top row

enter image description here
Reference: pandas: How do I select rows based on if X number of columns is greater than a number?

Upvotes: 1

user2285236
user2285236

Reputation:

Check the entire DataFrame for the condition and sum across rows. If that equals 1, the condition holds:

two.loc[(two>0).sum(axis=1)==1]
Out: 
   one  two
a   10  0.0


three.loc[(three>0).sum(axis=1)==1]
Out: 
   blue  red  two
a    10    0    0

Or with a lambda:

three.loc[lambda x: (x>0).sum(axis=1)==1]
Out: 
   blue  red  two
a    10    0    0

Upvotes: 1

Related Questions