Nandu Menon

Reputation: 25

Remove rows in a DataFrame that are not all 1 or all 0

I need to keep only the rows of a DataFrame whose values are all 0 or all 1.

import numpy as np
import pandas as pd

a = np.repeat(0, 10)
b = np.repeat(1, 10)
ab = pd.DataFrame({'col1': a, 'col2': b}).transpose()

Upvotes: 1

Views: 65

Answers (3)

Nandu Menon

Reputation: 25

I am currently using this, which also seems to work:

    Df = Df[(Df.sum(axis=1) == 0) | (Df.sum(axis=1) == Df.shape[1])]
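One caveat worth checking: the sum-based filter relies on every value being 0 or 1. The sketch below uses a made-up frame named demo (hypothetical, not from the question) and suggests that a row such as [2, 0] would also be kept, because its sum happens to equal the number of columns.

```python
import pandas as pd

# Hypothetical data: the last row is neither all 0 nor all 1.
demo = pd.DataFrame([[0, 0], [1, 1], [2, 0]], columns=["a", "b"])

# Sum-based filter from the post above.
row_sums = demo.sum(axis=1)
by_sum = demo[(row_sums == 0) | (row_sums == demo.shape[1])]

# Comparing each value directly avoids the false positive.
by_value = demo[demo.eq(0).all(axis=1) | demo.eq(1).all(axis=1)]

print(by_sum.index.tolist())    # [0, 1, 2]
print(by_value.index.tolist())  # [0, 1]
```

If the data is guaranteed to contain only 0s and 1s, both filters should select the same rows.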

Upvotes: 0

gremur

Reputation: 1690

A possible solution is the following:

# pip install pandas

import pandas as pd

# create test dataframe
df = pd.DataFrame({
    'col1': [0, 0, 0, 0],
    'col2': [1, 1, 1, 1],
    'col3': [0, 1, 0, 1],
    'col4': ['a', 'b', 0, 1],
    'col5': ['a', 'a', 'a', 'a'],
}).transpose()
df

# keep only the rows whose values are all 0 or all 1
df = df[df.eq(0).all(axis=1) | df.eq(1).all(axis=1)]
df

This returns only the rows col1 (all 0) and col2 (all 1).

Upvotes: 3

mozway

Reputation: 261820

One option: take the difference between consecutive values in each row and check that it is always 0:

import numpy as np
np.all(np.diff(ab.values, axis=1) == 0, axis=1)

Output:

array([ True,  True])

Then use this boolean mask to select the rows:

ab[np.all(np.diff(ab.values, axis=1) == 0, axis=1)]

Another option is to use nunique:

ab[ab.nunique(axis=1).eq(1)]
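Note that both the diff and the nunique checks test whether a row is constant, not whether its value is specifically 0 or 1. A minimal sketch, assuming a made-up frame named demo (hypothetical, not from the question), of how the nunique mask might be combined with a check on the first column when other constant values can occur:

```python
import pandas as pd

# Hypothetical data: row "c" is constant, but its value is 5.
demo = pd.DataFrame(
    [[0, 0, 0], [1, 1, 1], [5, 5, 5], [0, 1, 0]],
    index=["a", "b", "c", "d"],
)

# True for rows that contain a single distinct value.
constant = demo.nunique(axis=1).eq(1)

# True for rows whose first value is 0 or 1.
zero_or_one = demo.iloc[:, 0].isin([0, 1])

print(demo[constant].index.tolist())                # ['a', 'b', 'c']
print(demo[constant & zero_or_one].index.tolist())  # ['a', 'b']
```

When the frame holds only 0s and 1s, as in the question, the extra check is unnecessary.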

Upvotes: 3
