Reputation: 1280
Using a pandas dataframe, for example:
import pandas as pd
df = pd.DataFrame({'a': [1,0,0], 'b': [1,0,0]})
I have used the answer from "Pandas: sum DataFrame rows for given columns" to sum the two columns:
foo = df[['a', 'b']].sum(axis=1)
What I'm struggling with now is how to filter the rows that are assigned to foo. So, for example, I only want the rows that are greater than 0 to be in the result stored in foo. Does anyone know the best way of doing this?
Upvotes: 3
Views: 1645
Reputation: 837
You can use pandas basics such as a boolean condition and dropna.
df = pd.DataFrame({'a': [1,0,0], 'b': [1,0,0]})
foo = df[['a', 'b']].sum(axis=1)
foo = pd.DataFrame(foo)  # Convert foo to a DataFrame
foo = foo[foo > 0]  # Keep values greater than 0; the rest become NaN
foo.dropna(axis=0, inplace=True)  # Drop the rows that contain NaN
foo.columns = ['Result']  # Rename the column
foo
Output
Result
0 2.0
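As a possible shortcut, boolean indexing also works on a Series directly, so the conversion to a DataFrame and the dropna step may not be needed. A minimal sketch, assuming the same df as in the question; the name filtered is introduced here only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 0, 0], 'b': [1, 0, 0]})
foo = df[['a', 'b']].sum(axis=1)

# Index the Series with a boolean mask; non-matching rows are dropped
# rather than turned into NaN, so the integer dtype is preserved.
filtered = foo[foo > 0]
print(filtered)
```

Because no NaN values are introduced, the result stays an integer 2 instead of the float 2.0 shown above.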
I hope it may help you.
Upvotes: 1
Reputation: 863226
Use:
foo = df[['a', 'b']]
mask = foo.gt(0).all(axis=1)
out = foo[mask].sum(axis=1)
print (out)
0 2
dtype: int64
Details:
Compare with DataFrame.gt (>) to test for greater values:
print (foo.gt(0))
a b
0 True True
1 False False
2 False False
Then use DataFrame.all to test whether all values in each row are True. It is also possible to use DataFrame.any if you only need at least one True per row, which here means at least one value greater than 0 in the row:
print (foo.gt(0).all(axis=1))
0 True
1 False
2 False
dtype: bool
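With the sample data from the question, all and any give the same mask, so the difference is easy to miss. The sketch below uses a hypothetical DataFrame, df2 (not from the question), in which row 1 has only one value greater than 0:

```python
import pandas as pd

# Hypothetical data: row 1 has one value > 0, row 2 has none
df2 = pd.DataFrame({'a': [1, 0, 0], 'b': [1, 1, 0]})

all_mask = df2.gt(0).all(axis=1)  # True only if every column in the row is > 0
any_mask = df2.gt(0).any(axis=1)  # True if at least one column in the row is > 0

print(all_mask.tolist())  # [True, False, False]
print(any_mask.tolist())  # [True, True, False]
```

For non-negative values, the any mask keeps the same rows as testing whether the row sum is greater than 0, while the all mask is stricter.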
But if you want to filter by foo, use boolean indexing. Because foo and df share the same index, you can create the mask from foo and use it to filter the original DataFrame:
foo = df[['a', 'b']].sum(axis=1)
df = df[foo.gt(0)]
print (df)
a b
0 1 1
Detail:
print (foo.gt(0))
0 True
1 False
2 False
dtype: bool
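If both the original columns and the row sums are wanted in one result, one option is to attach the sums as a column before filtering. This is only a sketch; the column name total and the variable result are hypothetical names chosen for illustration:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 0, 0], 'b': [1, 0, 0]})
foo = df[['a', 'b']].sum(axis=1)

# Attach the row sums as a new column (aligned on the shared index),
# then keep only the rows whose sum is greater than 0.
result = df.assign(total=foo)
result = result[result['total'] > 0]
print(result)
```

DataFrame.assign returns a new DataFrame, so the original df is left unchanged.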
Upvotes: 1