Reputation: 1932
Assume a DataFrame is created where the number of columns and column names is dynamic. So you could have a DataFrame like:
two = pd.DataFrame({'one' : pd.Series([10, 0, 10], index=['a', 'b', 'c']),
'two' : pd.Series([0, 0, 10.], index=['a', 'b', 'c'])})
one two
a 10 0.0
b 0 0.0
c 10 10.0
Or you could have a Dataframe like:
three = pd.DataFrame({'blue' : pd.Series([10, 0, 10], index=['a', 'b', 'c']),
'red' : pd.Series([0, 0, 10], index=['a', 'b', 'c']),
'two' : pd.Series([0, 0, 10], index=['a', 'b', 'c'])})
blue red two
a 10 0 0
b 0 0 0
c 10 10 10
So you won't know how many columns or the column names until run time. There is no limit on number of columns.
How do you select rows where only one column is greater than zero?
So for a given row if all column values are zero or if more than one column value is greater than zero its excluded from selection.
From the two above examples I'd respectfully output:
one two
a 10 0
and
blue red two
a 10 0 0
Upvotes: 0
Views: 1361
Reputation: 2802
two[two[two > 0].count(axis=1) > 0].head(1)
Say you have
two = pd.DataFrame({'one' : pd.Series([10, 0, 10, 9], index=['a', 'b', 'c', 'd']), 'two' : pd.Series([0, 0, 10., 4.6], index=['a', 'b', 'c', 'd']), 'three' : pd.Series([5, -1, 7, -1], index=['a', 'b', 'c', 'd'])})
Select row where at least one column has value is greater than 0.
head(1)
will select the top row
Reference: pandas: How do I select rows based on if X number of columns is greater than a number?
Upvotes: 1
Reputation:
Check the entire DataFrame for the condition and sum across rows. If that equals 1, the condition holds:
two.loc[(two>0).sum(axis=1)==1]
Out:
one two
a 10 0.0
three.loc[(three>0).sum(axis=1)==1]
Out:
blue red two
a 10 0 0
Or with a lambda:
three.loc[lambda x: (x>0).sum(axis=1)==1]
Out:
blue red two
a 10 0 0
Upvotes: 1