Reputation: 9345
I have a Pandas DataFrame called result_df, containing a column called _type and one called _avg_engaged_time. I want to look at the rows where _type is 0 and _avg_engaged_time is between the 5th and 95th percentile. Here's my attempt so far:
First I filter based on _type:
original = result_df[result_df['_type'] == 0.0]
Then I find the percentiles:
low_original = original['_avg_engaged_time'].quantile(0.05)
high_original = original['_avg_engaged_time'].quantile(0.95)
Then I try to filter based on these percentiles:
original[original['_avg_engaged_time'] > low_original and original['_avg_engaged_time'] < high_original]
Unfortunately, I get this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I just want to use basic Boolean indexing to filter out rows that have an _avg_engaged_time less than the 5th percentile or greater than the 95th percentile...
Any ideas how to fix?
Thanks!
Upvotes: 1
Views: 627
Reputation: 2454
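You should use the bitwise operator & instead of and. Each comparison returns a Series of boolean values, not a single True or False, so and cannot reduce it to one truth value; & combines the two Series element by element. The parentheses around each comparison are required because & binds more tightly than > and <.
So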
original[(original['_avg_engaged_time'] > low_original) & (original['_avg_engaged_time'] < high_original)]
should work.
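For reference, here is a minimal end-to-end sketch of the filter. The sample values are made up for illustration; only the column and variable names come from the question. The Series.between call at the end is an optional alternative, assuming inclusive endpoints are acceptable (that is its default, unlike the strict > and < above).

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
result_df = pd.DataFrame({
    "_type": [0.0] * 20 + [1.0] * 5,
    "_avg_engaged_time": [float(t) for t in range(1, 21)] + [100.0] * 5,
})

# Keep only the rows where _type is 0.
original = result_df[result_df["_type"] == 0.0]

# 5th and 95th percentiles of the remaining rows.
low_original = original["_avg_engaged_time"].quantile(0.05)
high_original = original["_avg_engaged_time"].quantile(0.95)

# Each comparison yields a boolean Series; & combines them element-wise.
mask = (original["_avg_engaged_time"] > low_original) & (
    original["_avg_engaged_time"] < high_original
)
trimmed = original[mask]

# Alternative: Series.between, which includes both endpoints by default.
between_trimmed = original[
    original["_avg_engaged_time"].between(low_original, high_original)
]

print(len(original), len(trimmed))
```

Building the mask as a separate variable keeps the indexing line short and makes it easy to inspect which rows are being dropped.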
Upvotes: 3