user1140126
user1140126

Reputation: 2649

Delete all rows containing all zeros after a certain column number

In pandas dataframe how do I delete all rows which have zeros after a certain column. For example

from pandas import DataFrame
df = DataFrame({'a' : [0,1,1,0,0,0,0], 'b' : [0,1,-1, 1,0,0,0], 'c': [1,4,5,6,7,0,0]}).T

df:

    0   1   2   3   4   5   6
a   0   1   1   0   0   0   0
b   0   1   -1  1   0   0   0
c   1   4   5   6   7   0   0

How do I drop rows containing all values as zero after column 3? The first and second rows (index a and b) in this example are to be dropped.

Upvotes: 1

Views: 1081

Answers (3)

ssm
ssm

Reputation: 5383

In case you have an arbitrary number of columns, you cna always do:

df[ df.ix[:, 4:].T.abs().sum() != 0 ]

Upvotes: 2

EdChum
EdChum

Reputation: 394389

You can subscript the columns, replace 0 with NaN, drop any rows that don't have at least 1 non NaN value and use loc on the index:

In [63]:
df.loc[df[df.columns[4:]].replace(0, NaN).dropna(thresh=1).index]
Out[63]:
   0  1  2  3  4  5  6
c  1  4  5  6  7  0  0

So breaking this down:

In [64]:
df[df.columns[4:]]

Out[64]:
   4  5  6
a  0  0  0
b  0  0  0
c  7  0  0

In [66]:   
df[df.columns[4:]].replace(0, NaN)

Out[66]:
    4   5   6
a NaN NaN NaN
b NaN NaN NaN
c   7 NaN NaN

In [67]:    
df[df.columns[4:]].replace(0, NaN).dropna(thresh=1)

Out[67]:
   4   5   6
c  7 NaN NaN

In [68]:    
df[df.columns[4:]].replace(0, NaN).dropna(thresh=1).index

Out[68]:
Index(['c'], dtype='object')

Update Actually a more concise way:

In [77]:

df[any(df[df.columns[4:]] != 0, axis=1)]
Out[77]:
   0  1  2  3  4  5  6
c  1  4  5  6  7  0  0

Upvotes: 2

Amit
Amit

Reputation: 20506

df[(df[4] != 0) | (df[5] != 0) | (df[6] != 0)] 

Upvotes: 0

Related Questions