strawberrylatte

Reputation: 123

Drop a row only if all columns contain 0

I am trying to drop rows that have 0 in all 3 columns. I tried the code below, but it dropped every row that has 0 in any one of the 3 columns instead.

indexNames = news[ news['contain1']&news['contain2'] &news['contain3']== 0 ].index
news.drop(indexNames , inplace=True)

My CSV file

contain1  contain2  contain3
   1        0         0
   0        0         0
   0        1         1
   1        0         1
   0        0         0
   1        1         1

With the code I used, all of my rows would be deleted. Below is the result I wanted instead:

contain1  contain2  contain3
   1        0         0
   0        1         1
   1        0         1
   1        1         1

Upvotes: 2

Views: 221

Answers (4)

jezrael

Reputation: 862396

First compare every value against 0 with DataFrame.ne (not equal), then use DataFrame.any to keep the rows with at least one non-zero value, so only the all-zero rows are removed:

df = news[news.ne(0).any(axis=1)]
# If necessary, check only a subset of columns given as a list:
#cols = ['contain1','contain2','contain3']
#df = news[news[cols].ne(0).any(axis=1)]
print (df)
   contain1  contain2  contain3
0         1         0         0
2         0         1         1
3         1         0         1
5         1         1         1

Details:

print (news.ne(0))
   contain1  contain2  contain3
0      True     False     False
1     False     False     False
2     False      True      True
3      True     False      True
4     False     False     False
5      True      True      True

print (news.ne(0).any(axis=1))
0     True
1    False
2     True
3     True
4    False
5     True
dtype: bool
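
For context, here is a minimal sketch of why the mask in the question behaves differently, assuming the sample data from the question is loaded into a DataFrame named news. In Python, & binds more tightly than ==, so the question's expression is evaluated as (contain1 & contain2 & contain3) == 0, which is true whenever any one column is 0. The names question_mask, all_zero and result are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
news = pd.DataFrame({
    "contain1": [1, 0, 0, 1, 0, 1],
    "contain2": [0, 0, 1, 0, 0, 1],
    "contain3": [0, 0, 1, 1, 0, 1],
})

# The question's mask: `&` is applied before `==`, so this is
# (contain1 & contain2 & contain3) == 0, true whenever any column is 0.
question_mask = news["contain1"] & news["contain2"] & news["contain3"] == 0

# Intended mask: compare every column to 0 first, then require all of them.
all_zero = (news == 0).all(axis=1)

# Dropping by index label, as in the question, now removes only all-zero rows.
result = news.drop(news[all_zero].index)
print(result)
```

The all_zero mask is simply the negation of news.ne(0).any(axis=1), so this drop-based version and the filter above should select the same rows.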

Upvotes: 2

dzakyputra

Reputation: 682

You might want to try this.

news[(news.T != 0).any()]
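
As a rough check, the transposed form should select the same rows as a row-wise any(axis=1), because any() on the transposed frame reduces over what were originally the columns. The sketch below assumes the sample data from the question; the names via_transpose and via_axis are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
news = pd.DataFrame({
    "contain1": [1, 0, 0, 1, 0, 1],
    "contain2": [0, 0, 1, 0, 0, 1],
    "contain3": [0, 0, 1, 1, 0, 1],
})

# any() on the transposed frame yields one boolean per original row label.
via_transpose = news[(news.T != 0).any()]

# The same reduction without transposing: reduce across columns per row.
via_axis = news[(news != 0).any(axis=1)]

print(via_transpose.equals(via_axis))
```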

Upvotes: 1

Sabri B

Reputation: 51

A simple solution would be to filter on the sum of each row: news[news.sum(axis=1)!=0] keeps only the rows whose values do not add up to 0. Hope this helps :)
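
One caveat worth sketching: the sum test matches "all zeros" only when the values cannot cancel each other out, as with the 0/1 flags in the question. The small frame below is made up purely for illustration (it is not the question's data) and shows a row with a negative value that the sum filter would drop while an element-wise comparison keeps it.

```python
import pandas as pd

# Hypothetical data containing a negative value, for illustration only.
df = pd.DataFrame({"a": [1, 0, -1], "b": [0, 0, 1]})

# Row 2 sums to 0 even though neither of its values is 0.
by_sum = df[df.sum(axis=1) != 0]

# Comparing element-wise keeps every row with at least one non-zero value.
by_ne = df[df.ne(0).any(axis=1)]

print(by_sum.index.tolist())
print(by_ne.index.tolist())
```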

Upvotes: 1

Ollie in PGH

Reputation: 2629

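If this is a pandas DataFrame, you can sum the values in each row with .sum(axis=1) and drop the rows whose sum is 0.

# axis=1 sums across the columns, giving one total per row
news_sums = news.sum(axis=1)
indexNames = news.loc[news_sums == 0].index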
news.drop(indexNames, inplace=True)

(note: Not tested, just from memory)

Upvotes: 1
