Zack Eriksen

Reputation: 301

How do I check if each row of a given DataFrame column falls between a specified range?

I have a large DataFrame (over a million columns) and I would like to keep only those columns whose values fall between two numbers, but the specified range is different for each row. I was thinking of defining two separate Series (a lower limit and an upper limit) to use for the comparison, but I don't know the most efficient way to do this. For example, if a is a single column from my large DataFrame, I only want to keep it if the value in every row falls between a_low and a_high (inclusive). Below, a would meet the criteria.

import pandas as pd

a = pd.Series([1,4,5,2,3,3,5,7])

a_high = pd.Series([2,4,6,2,4,4,6,8])
a_low = pd.Series([0,2,4,0,2,2,4,6])

Because my DataFrame is so big, I'm trying to avoid looping through each row. Do you have any suggestions? I was wondering if df.apply() or a list comprehension might help here. Thanks!

-Zack

Upvotes: 0

Views: 293

Answers (1)

Quang Hoang

Reputation: 150825

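Try between, which accepts a Series for each bound, compares element-wise, and includes both endpoints by default. This keeps the rows of df where a is within its range: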

df[df['a'].between(a_low, a_high)]
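Since the question asks about keeping whole columns rather than filtering rows, here is a minimal sketch of one way that might be done without a Python-level loop. It assumes the same a_low and a_high apply to every column and share the DataFrame's row index. The column b and the names in_range, keep and kept are made up for illustration.

```python
import pandas as pd

# Per-row bounds from the question.
a_low = pd.Series([0, 2, 4, 0, 2, 2, 4, 6])
a_high = pd.Series([2, 4, 6, 2, 4, 4, 6, 8])

df = pd.DataFrame({
    "a": [1, 4, 5, 2, 3, 3, 5, 7],  # from the question: every row is in range
    "b": [1, 9, 5, 2, 3, 3, 5, 7],  # hypothetical: second row exceeds its upper bound
})

# Single column: boolean mask, one entry per row (inclusive on both ends).
row_mask = df["a"].between(a_low, a_high)

# All columns at once: axis=0 aligns the bounds with the row index,
# so each column is compared against the same per-row limits.
in_range = df.ge(a_low, axis=0) & df.le(a_high, axis=0)

# Keep only the columns where every row passed the check.
keep = in_range.all(axis=0)
kept = df.loc[:, keep]

print(kept.columns.tolist())
```

If each column needs its own bounds, the limits could instead be stored as DataFrames with the same shape as df and compared directly with >= and <=.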

Upvotes: 1
