user2741831

Reputation: 2410

Check if a number in a list is within any list of ranges

I have a list of numbers and a list of ranges. I would like to efficiently check, for every number, whether it falls within any of the ranges, and get the result as a boolean array. Like this:

a=[1,2,3,4,7,8,9]
b=[(0,3),(8,10)]
f(a,b)=>[True,True,True,False,False,True,True]

Upvotes: 1

Views: 76

Answers (4)

Paul Panzer

Reputation: 53119

If the intervals are non-overlapping, np.searchsorted should be rather efficient:

(np.nextafter(b,b+np.arange(-1,1)).ravel().searchsorted(a)&1).astype(bool)
# array([ True,  True,  True, False, False,  True,  True])
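To spell out why the one-liner works: np.nextafter nudges each low end down by one float step, so the flattened array reads low0, high0, low1, high1, ... and, as long as the intervals are sorted and disjoint, a value lies inside an interval exactly when its insertion index is odd. Below is a more explicit sketch of the same searchsorted idea, assuming closed intervals with low <= high. The name in_any_range is made up for illustration and is not part of the answer above.

```python
import numpy as np

def in_any_range(a, b):
    """Hypothetical helper: True where a value lies in any closed range [low, high]."""
    a = np.asarray(a)
    bounds = np.asarray(b)
    # Sort the starts and the ends independently.
    lows = np.sort(bounds[:, 0])
    highs = np.sort(bounds[:, 1])
    # Number of ranges that start at or before each value.
    started = np.searchsorted(lows, a, side="right")
    # Number of ranges that end strictly before each value.
    ended = np.searchsorted(highs, a, side="left")
    # A value is covered when more ranges have started than have ended.
    return started > ended

a = [1, 2, 3, 4, 7, 8, 9]
b = [(0, 3), (8, 10)]
result = in_any_range(a, b)
```

Since the starts and ends are counted separately, this variant should not need the ranges to be pre-sorted or disjoint, at the cost of two searchsorted calls instead of one.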

Timings using @Divakar's benchit:

[Image: benchit timing plot]

Code for making the plot:

import benchit
import numpy as np
import pandas as pd

def pp(ab):
    a,b=ab
    return (np.nextafter(b,b+np.arange(-1,1)).ravel().searchsorted(a)&1) \
        .astype(bool)

def dv(ab):
    a,b=ab
    L = max(np.max(a), max(max(b))+1)+1
    mask = np.zeros(L, dtype=bool)
    for (i,j) in b:
        mask[i:j+1] = 1
    return mask[a]

def ys(ab):
    a,b=ab
    # Return the result, as the other benchmarked functions do.
    return [any(y in x for x in pd.IntervalIndex.from_tuples(b,closed='both')) for y in a ]

def cn(ab):
    a,b=ab
    return [
        any(low <= i <= high for low, high in b)
        for i in a
    ]
    
def make(n):
    b = np.random.randint(1,11,(n//10*2)).cumsum().reshape(-1,2)
    b = [(x,y) for x,y in b.tolist()]
    a = np.random.randint(0,n,n//3).tolist()
    return a,b
    
in_ = {n:make(n) for n in [10,20,50,100,200,500,1000]}
funcs = [pp,dv,ys,cn]
t = benchit.timings(funcs, in_)
t.rank()
t.plot(logx=True, save='timings.png')

Upvotes: 1

Divakar

Reputation: 221754

Here's one with masking -

L = max(np.max(a), max(max(b))+1)+1
mask = np.zeros(L, dtype=bool)
for (i,j) in b:
    mask[i:j+1] = 1
out = mask[a]
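For reference, here is the same masking idea wrapped in a function, as a minimal sketch. It assumes all numbers and range bounds are non-negative integers, because the values are used directly as array indices; the name in_ranges_mask is hypothetical.

```python
import numpy as np

def in_ranges_mask(a, b):
    """Hypothetical helper: lookup-table test for non-negative integer inputs."""
    a = np.asarray(a)
    # The table must cover the largest value and the largest upper bound.
    size = max(int(a.max()), max(high for _, high in b)) + 1
    mask = np.zeros(size, dtype=bool)
    for low, high in b:
        # Ranges are inclusive, hence high + 1.
        mask[low:high + 1] = True
    # Fancy indexing looks up every value of a in one step.
    return mask[a]

a = [1, 2, 3, 4, 7, 8, 9]
b = [(0, 3), (8, 10)]
out = in_ranges_mask(a, b)
```

Note that the table size grows with the largest value rather than with the number of elements, so this suits small integer domains best.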

Upvotes: 0

Cohan

Reputation: 4564

Here's a pure Python version. You can change <= to < if you want exclusive ranges.

a = [1, 2, 3, 4, 7, 8, 9]
b = [(0, 3), (8, 10)]

result = [
    any(low <= i <= high for low, high in b)
    for i in a
]
# [True, True, True, False, False, True, True]
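As an illustration of the exclusive variant, here is a sketch with half-open ranges, where only the upper bound is excluded (one possible reading of "exclusive"; the variable name half_open is mine):

```python
a = [1, 2, 3, 4, 7, 8, 9]
b = [(0, 3), (8, 10)]

# Half-open ranges [low, high): the upper bound itself is excluded.
half_open = [
    any(low <= i < high for low, high in b)
    for i in a
]
# [True, True, False, False, False, True, True]
```

The only change in the result is that 3 no longer counts as inside (0, 3).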

Upvotes: 2

BENY

Reputation: 323396

We can use pd.IntervalIndex:

[any(y in x for x in pd.IntervalIndex.from_tuples(b,closed='both')) for y in a ]
Out[48]: [True, True, True, False, False, True, True] 
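If the intervals do not overlap, the Python-level loop can probably be avoided with IntervalIndex.get_indexer. This is a sketch under that assumption, not something taken from the answer above:

```python
import pandas as pd

a = [1, 2, 3, 4, 7, 8, 9]
b = [(0, 3), (8, 10)]

idx = pd.IntervalIndex.from_tuples(b, closed="both")
# get_indexer gives the position of the interval containing each value,
# or -1 when no interval contains it.
# It raises an error for overlapping intervals, so disjoint ranges are assumed.
result = (idx.get_indexer(a) != -1).tolist()
```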

Upvotes: 3
