muazfaiz

Reputation: 5031

Filter DataFrame in Pandas on sum of rows

I have a dataframe

[1] df
ProductIds  A   B   C   D
11210000018 0   0   0   0
11210000155 1   0   0   0
11210006508 0   0   0   0
11210007253 0   0   0   0
11210009431 0   0   0   0
11210135871 1   0   0   0

I want to filter the frame by summing each row and keeping only the rows whose sum is greater than zero. For the data above, the result would be:

ProductIds  A   B   C   D
11210000155 1   0   0   0
11210135871 1   0   0   0

One way of doing that is to add another column with sum and then filter like the following:

df['Sum'] = df.sum(axis=1)
df = df[df.Sum > 0]
df = df.drop(columns=['Sum'])

But is there a built-in one-liner to do this? I cannot add up the columns manually because there are thousands of them. Thanks.

Upvotes: 1

Views: 6287

Answers (2)

MaxU - stand with Ukraine

Reputation: 210902

Another solution:

In [194]: df.query('A + B + C + D > 0')
Out[194]:
             A  B  C  D
ProductIds
11210000155  1  0  0  0
11210135871  1  0  0  0

Upvotes: 0

jezrael

Reputation: 863216

If the DataFrame contains only 0 and values greater than 0, I think you can use DataFrame.all: test whether all values in a row are 0, then invert that mask with boolean indexing:

mask = (df == 0).all(axis=1)
print (mask)
ProductIds
11210000018     True
11210000155    False
11210006508     True
11210007253     True
11210009431     True
11210135871    False
dtype: bool

print (df[~mask])
             A  B  C  D
ProductIds             
11210000155  1  0  0  0
11210135871  1  0  0  0
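
The same selection can also be written without the negation, by testing for at least one nonzero value per row. This is a minimal sketch; the way the frame is rebuilt here (with ProductIds as the index) is an assumption based on the printed output in the question:

```python
import pandas as pd

# Rebuild the sample frame from the question.
# Assumption: ProductIds is the index, as the printed output suggests.
df = pd.DataFrame(
    {"A": [0, 1, 0, 0, 0, 1], "B": [0] * 6, "C": [0] * 6, "D": [0] * 6},
    index=pd.Index(
        [11210000018, 11210000155, 11210006508,
         11210007253, 11210009431, 11210135871],
        name="ProductIds",
    ),
)

# Keep rows that contain at least one nonzero value.
result = df[(df != 0).any(axis=1)]
print(result)
```

This works for any number of columns, since no column names are spelled out.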

A more general solution is to use the boolean mask directly in boolean indexing; it is not necessary to create a new column:

df = df[df.sum(axis = 1) > 0]
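
Note that the sum-based mask and the "any nonzero value" test only agree when no values are negative: a row such as [1, -1] has nonzero entries but sums to zero. A small sketch with made-up data (the frame below is hypothetical, not taken from the question) shows the difference:

```python
import pandas as pd

# Hypothetical data: row "y" holds nonzero values that cancel out.
df = pd.DataFrame({"A": [0, 1, 2], "B": [0, -1, 0]}, index=["x", "y", "z"])

# Row sums are x=0, y=0, z=2, so only "z" passes the sum test.
by_sum = df[df.sum(axis=1) > 0]

# Rows "y" and "z" both contain a nonzero value.
by_nonzero = df[(df != 0).any(axis=1)]

print(by_sum.index.tolist())
print(by_nonzero.index.tolist())
```

Which one is right depends on whether "sum greater than zero" or "not all zeros" is the actual requirement.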

Upvotes: 3
