Reputation: 2183

How to remove rows from a DataFrame where some columns only have zero values

I have the following Pandas DataFrame in Python:

import numpy as np
import pandas as pd
df  = pd.DataFrame(np.array([[1, 2, 3, 4, 5, 6], [11, 22, 33, 44, 55, 66], 
                             [111, 222, 0, 0, 0, 0], [1111, 0, 0, 0, 0, 0]]),
                   columns=['a', 'b', 'c', 'd', 'e', 'f'])

DataFrame looks as the following in a table:

    a      b     c    d     e    f
0   1      2     3    4     5    6
1   11     22    33   44    55   66
2   111    222   0    0     0    0
3   1111   2222  0    0     0    0

The original DataFrame is much bigger than this. As seen, some rows have zero values in some columns (c, d, e, f).

I need to remove these columns from the DataFrame so that my new DataFrame will look as the following (after removing rows where given columns are zeros only):

    a      b     c    d     e    f
0   1      2     3    4     5    6
1   11     22    33   44    55   66

And I only need to remove the rows where all these column (c, d, e, and f) are zeros. If, for example, 2 of them are 0, then I will not remove such rows.

Is there a good way of doing this operation without looping through the DataFrame?

Upvotes: 1

Answers (4)

geekzeus

Reputation: 895

with operators

df.loc[~((((df['c'] == 0) & (df['d'] == 0)) & (df['e'] == 0)) & (df['f'] == 0))]

Upvotes: 2

mosc9575

Reputation: 6337

Here is one more option: Use df.query() with an self defined query.

my_query = '~('+'and '.join([f'{name}==0' for name in 'cdef'])+')'
df.query(my_query)

If you print my_query it is easy to read: ~(c==0 and d==0 and e==0 and f==0) with ~ means 'not'.

Upvotes: 1

thorntonc

Reputation: 2126

Row filtering on selected columns, any have zeroes with any:

import numpy as np
import pandas as pd

df  = pd.DataFrame(np.array([[1, 2, 3, 4, 5, 6], [11, 22, 33, 44, 55, 66],
                             [111, 222, 0, 0, 0, 0], [1111, 0, 0, 0, 0, 0]]),
                   columns=['a', 'b', 'c', 'd', 'e', 'f'])

df = df[(df[['c', 'd', 'e', 'f']] != 0).any(axis=1)]

print(df)

Output:

    a   b   c   d   e   f
0   1   2   3   4   5   6
1  11  22  33  44  55  66

Upvotes: 3

sushanth

Reputation: 8302

try this,

df[~df[list('cdef')].eq(0).all(axis = 1)]

    a   b   c   d   e   f
0   1   2   3   4   5   6
1  11  22  33  44  55  66

Upvotes: 1

How to remove rows from a DataFrame where some columns only have zero values

Answers (4)

Related Questions