Reputation: 1460
Let's say I have the following data set:
A B
10.1 53
12.5 42
16.0 37
20.7 03
25.6 16
30.1 01
40.9 19
60.5 99
I have the following list of ranges:
[[9,15],[19,22],[39,50]]
How do I efficiently pull rows that lie in those ranges?
Desired output:
A B
10.1 53
12.5 42
20.7 03
40.9 19
Edit: This needs to work for floating-point values.
Upvotes: 1
Views: 156
Reputation: 386
Here's another method (edit: works with floats or integers). @jpp's might be faster, but this code is easier to understand (in my opinion).
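import pandas as pd

df = pd.DataFrame([[10.1,53],[12.5,42],[16.0,37],[20.7,3],[25.6,16],[30.1,1],[40.9,19],[60.5,99]],columns=list('AB'))
ranges = [[9,15],[19,22],[39,50]]
parts = []
for r in ranges:
    # Keep rows whose A lies strictly between the bounds.
    # inclusive='neither' replaces the deprecated inclusive=False (pandas >= 1.3).
    parts.append(df[df['A'].between(r[0], r[1], inclusive='neither')])
# DataFrame.append was removed in pandas 2.0, so stack the pieces with pd.concat.
result = pd.concat(parts)
print(result)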
Here's the output:
A B
0 10.1 53
1 12.5 42
3 20.7 3
6 40.9 19
PS: the following one-line list comprehension also works:
result = pd.concat([df[df['A'].between(r[0], r[1], inclusive='neither')] for r in ranges])
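One detail worth checking is how boundary values are treated: this answer excludes the endpoints of each range, while a mask built with `>=` and `<=` includes them. A small sketch, assuming pandas 1.3 or later, where `between` accepts `inclusive` as one of the strings 'both', 'neither', 'left' or 'right'; the series `s` is made up for illustration:

```python
import pandas as pd

# Illustrative values sitting on and inside the bounds of the range [9, 15]
s = pd.Series([9.0, 10.1, 15.0])

strict = s.between(9, 15, inclusive='neither')   # endpoints excluded
closed = s.between(9, 15, inclusive='both')      # endpoints included (the default)

print(strict.tolist())   # [False, True, False]
print(closed.tolist())   # [True, True, True]
```

For the sample data in the question no value of A lands exactly on a bound, so both choices return the same four rows.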
Upvotes: 0
Reputation: 164773
Update for modified question
For floats, you can construct a mask using NumPy array operations:
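import numpy as np

L = np.array([[9,15],[19,22],[39,50]])
A = df['A'].values
# Compare every value of A against every range via broadcasting,
# then keep rows that fall inside at least one range (bounds included).
mask = ((A >= L[:, 0][:, None]) & (A <= L[:, 1][:, None])).any(0)
res = df[mask]
print(res)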
A B
0 10.1 53
1 12.5 42
3 20.7 3
6 40.9 19
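To see why the mask works, it may help to look at the shapes involved. The sketch below rebuilds the same data and splits the one-liner into steps; the names `lower`, `upper` and `inside` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [10.1, 12.5, 16.0, 20.7, 25.6, 30.1, 40.9, 60.5],
                   'B': [53, 42, 37, 3, 16, 1, 19, 99]})
L = np.array([[9, 15], [19, 22], [39, 50]])
A = df['A'].values               # shape (8,)

lower = L[:, 0][:, None]         # shape (3, 1): one lower bound per row
upper = L[:, 1][:, None]         # shape (3, 1): one upper bound per row

# Broadcasting (3, 1) against (8,) gives a (3, 8) boolean grid:
# inside[i, j] is True when A[j] lies within range i (bounds included).
inside = (A >= lower) & (A <= upper)

# Collapse the range axis: a row is kept if it falls in any range.
mask = inside.any(axis=0)        # shape (8,)
res = df[mask]
print(res.index.tolist())        # [0, 1, 3, 6]
```

The intermediate grid holds len(L) × len(df) booleans, so this is cheap for a short list of ranges but grows with both dimensions.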
Previous answer to original question
For integers, you can use numpy.concatenate with numpy.arange:
import numpy as np

L = [[9,15],[19,22],[39,50]]
# Expand each range into its integers and test membership with isin
vals = np.concatenate([np.arange(i, j) for i, j in L])
res = df[df['A'].isin(vals)]
print(res)
A B
0 10 53
1 12 42
3 20 3
6 40 19
An alternative solution with itertools.chain and range:
from itertools import chain
vals = set(chain.from_iterable(range(i, j) for i, j in L))
res = df[df['A'].isin(vals)]
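Both integer variants rely on exact membership, which is why they stop matching once column A holds floats; `range(i, j)` and `np.arange(i, j)` also leave out the upper bound `j`. A short check, using made-up series `ints` and `floats` rather than the question's DataFrame:

```python
from itertools import chain
import pandas as pd

L = [[9, 15], [19, 22], [39, 50]]
vals = set(chain.from_iterable(range(i, j) for i, j in L))

ints = pd.Series([10, 15, 20])          # illustrative integer values
floats = pd.Series([10.1, 12.5, 20.7])  # illustrative float values

int_hits = ints.isin(vals)
float_hits = floats.isin(vals)

print(int_hits.tolist())     # [True, False, True]: 15 is the excluded upper bound
print(float_hits.tolist())   # [False, False, False]: no exact integer match
```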
Upvotes: 1