Pythonuser

Reputation: 213

How to filter a DataFrame based on multiple conditions

I have a DataFrame that I am trying to filter using multiple conditions:

remove_outliers[remove_outliers['outlier_residual'] > (Q3 + 1.5 * IQR) and remove_outliers['season'] =='Autumn']

When I try this, I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-304-141eedb8a594> in <module>
----> 1 remove_outliers[remove_outliers['outlier_residual'] > (Q3 + 1.5 * IQR) and remove_outliers['season'] =='Autumn']

~\AppData\Roaming\Python\Python37\site-packages\pandas\core\generic.py in __nonzero__(self)
   1328     def __nonzero__(self):
   1329         raise ValueError(
-> 1330             f"The truth value of a {type(self).__name__} is ambiguous. "
   1331             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
   1332         )

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What is the correct way to do this? I would appreciate any help or advice.

Upvotes: 0

Views: 65

Answers (2)

Raghav Sharma

Reputation: 195

remove_outliers.loc[(remove_outliers['outlier_residual'] > (Q3 + 1.5 * IQR)) & (remove_outliers['season'] =='Autumn')]

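Use the element-wise & operator instead of and, and wrap each condition in parentheses, because & binds more tightly than > and ==. There is also no need to nest .loc inside .loc.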
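To make the difference concrete, here is a minimal sketch on made-up data. The column names follow the question, but the rows and the Q3 and IQR values are hypothetical placeholders, not the asker's real numbers. It shows that `and` raises the same ValueError as in the traceback, while `&` with parenthesised conditions returns the expected rows.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
remove_outliers = pd.DataFrame({
    "outlier_residual": [1.0, 9.0, 12.0, 3.0, 8.0],
    "season": ["Autumn", "Autumn", "Winter", "Autumn", "Spring"],
})

# Hypothetical quartile values, chosen so the upper fence is 7.0.
Q3 = 4.0
IQR = 2.0

# `and` asks Python for the truth value of a whole Series, which pandas refuses.
try:
    remove_outliers[(remove_outliers["outlier_residual"] > (Q3 + 1.5 * IQR))
                    and (remove_outliers["season"] == "Autumn")]
    and_failed = False
except ValueError:
    and_failed = True

# `&` combines the two boolean Series element by element.
result = remove_outliers.loc[
    (remove_outliers["outlier_residual"] > (Q3 + 1.5 * IQR))
    & (remove_outliers["season"] == "Autumn")
]
print(result)
```

With this sample only the row at index 1 is both above the fence and in Autumn, so that is the only row kept.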

Upvotes: 1

Nuno B. Brandao

Reputation: 76

I guess you are missing a pair of parentheses around each condition, and you need & rather than and. Let me know whether it works now:

remove_outliers.loc[(remove_outliers.loc[:,'outlier_residual'] > (Q3 + 1.5 * IQR)) & (remove_outliers.loc[:,'season'] == 'Autumn'), :]

P.S. I have used .loc as a matter of good practice.
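One way to avoid losing track of parentheses is to build each condition as a named boolean mask first and then combine the masks. This is only a sketch: the data, the Q3 and IQR values, and the names is_high and is_autumn are all hypothetical, not taken from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
remove_outliers = pd.DataFrame({
    "outlier_residual": [1.0, 9.0, 12.0, 3.0, 8.0],
    "season": ["Autumn", "Autumn", "Winter", "Autumn", "Spring"],
})

# Hypothetical quartile values, chosen so the upper fence is 7.0.
Q3 = 4.0
IQR = 2.0

# Each mask is a boolean Series with one True/False per row.
is_high = remove_outliers["outlier_residual"] > (Q3 + 1.5 * IQR)
is_autumn = remove_outliers["season"] == "Autumn"

# Combine the masks with & and select all columns of the matching rows.
filtered = remove_outliers.loc[is_high & is_autumn, :]
print(filtered)
```

Because each comparison is evaluated on its own line, operator precedence between & and the comparison operators no longer matters.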

Upvotes: 0
