bibscy

Reputation: 2708

How to return a boolean Series based on multiple conditions in pandas?

I would like to build a boolean Series from multiple conditions and then use it to subset the original DataFrame.

The code below returns a DataFrame rather than a boolean Series.

# remove outliers
minInCollection = myDataFrame[
    (myDataFrame.Age > myDataFrame.Age.min()) &
    (myDataFrame.Age < myDataFrame.Age.max()) &
    (myDataFrame.Paid_Off_In_Days > myDataFrame.Paid_Off_In_Days.min()) &
    (myDataFrame.Paid_Off_In_Days < myDataFrame.Paid_Off_In_Days.max())
]

print("type is ", str(type(minInCollection)))
  <class 'pandas.core.frame.DataFrame'>

Upvotes: 1

Views: 2155

Answers (1)

a11

Reputation: 3396

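You are very close. The combined conditions are already a Boolean series; wrapping them in the outer myDataFrame[...] brackets is what turns the result into a DataFrame, because indexing with a Boolean series selects the matching rows. If you want the Boolean series itself, drop those outer brackets. See the example code below.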

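import pandas as pd

# Make up data
colA = [20, 30, 17, 30, 22, 27, 30, 24]
myDataFrame = pd.DataFrame({'Age': colA})

# No outer myDataFrame[...]: the combined conditions are the Boolean series
minInCollection = (myDataFrame.Age > myDataFrame.Age.min()) & (myDataFrame.Age < myDataFrame.Age.max())
print(minInCollection)  # display() also works, but only in IPython/Jupyter
print(type(minInCollection))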

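To cover both halves of the question (get the Boolean series, then subset with it), here is a minimal sketch that keeps the mask in its own variable and applies it afterwards. The column names come from the question; the Paid_Off_In_Days values and the names df, mask and trimmed are made up for illustration.

```python
import pandas as pd

# Made-up data: column names follow the question, values are illustrative only
df = pd.DataFrame({
    "Age": [20, 30, 17, 30, 22, 27, 30, 24],
    "Paid_Off_In_Days": [5, 12, 9, 30, 14, 2, 21, 16],
})

# Each comparison yields a Boolean series; & combines them element-wise.
# The parentheses around each comparison are required because & binds
# more tightly than > and <.
mask = (
    (df["Age"] > df["Age"].min())
    & (df["Age"] < df["Age"].max())
    & (df["Paid_Off_In_Days"] > df["Paid_Off_In_Days"].min())
    & (df["Paid_Off_In_Days"] < df["Paid_Off_In_Days"].max())
)

# Indexing with the mask keeps only the rows where it is True
trimmed = df[mask]  # df.loc[mask] selects the same rows

print(type(mask))     # the Boolean series
print(type(trimmed))  # the filtered DataFrame
print(trimmed)
```

Keeping the mask separate means you can inspect it, count the matches with mask.sum(), or invert it with ~mask to look at the rows being dropped.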

Upvotes: 1
