MadMax

Reputation: 183

Efficient way to get all numpy slices for different ranges

I want to slice the same numpy array (data_ar) multiple times, each time selecting the values that fall in a different range.

data_ar shape: (203,)

range_ar shape: (1000,)

I implemented it with a for loop, but it takes way too long since I have a lot of data_ar arrays:

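# create the results array (one object slot per range centre)
results_ar = np.zeros(shape=(1000,), dtype=object)

# "center" replaces the original loop variable "range", which shadowed the builtin
for i, center in enumerate(range_ar):
    # keep values in [center - delta, center + delta); a numpy array has no .values
    results_ar[i] = data_ar[(data_ar >= (center - delta)) & (data_ar < (center + delta))]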

so for example:

data_ar = [1,3,4,6,10,12]
range_ar = [7,4,2]
delta= 3

expected output:
(note results_ar shape=(3,), dtype=object, each element is an array)

results_ar = [[4, 6],
              [1, 3, 4, 6],
              [1, 3, 4]]

Any ideas on how to tackle this?

Upvotes: 2

Views: 251

Answers (2)

Niteya Shah

Reputation: 1824

You can use numba to speed up the computations.

import numpy as np
import numba
from numba.typed import List
import timeit

data_ar = np.array([1,3,4,6,10,12])
range_ar = np.array([7,4,2])
delta = 3

def foo(data_ar, range_ar):
    # plain NumPy baseline: one boolean mask per range centre
    results_ar = list()
    for i in range_ar:
        results_ar.append(data_ar[( (data_ar>=(i-delta)) & (data_ar<(i+delta)) )])
    return results_ar

print(timeit.timeit(lambda :foo(data_ar, range_ar)))

# same loop compiled with numba; redefining foo replaces the baseline above
@numba.njit(parallel=True, fastmath=True)
def foo(data_ar, range_ar):
    results_ar = List()
    for i in range_ar:
        results_ar.append(data_ar[( (data_ar>=(i-delta)) & (data_ar<(i+delta)) )])
    return results_ar

print(timeit.timeit(lambda :foo(data_ar, range_ar)))

15.53519330600102

1.6557575029946747

That is roughly a 9.4 times speedup.

Upvotes: 1

Mad Physicist

Reputation: 114440

You could use np.searchsorted like this (it requires data_ar to be sorted, as it is in the example):

import numpy as np

data_ar = np.array([1, 3, 4, 6, 10, 12])
range_ar = np.array([7, 4, 2])
delta = 3

bounds = range_ar[:, None] + delta * np.array([-1, 1])

result = [data_ar[slice(*row)] for row in np.searchsorted(data_ar, bounds)]
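
For completeness, here is a minimal sketch that wraps the same idea in a function and checks it against the boolean-mask loop from the question. The function names slices_by_range and slices_by_mask are made up for illustration, and the np.sort call is my own addition for the case where data_ar is not already sorted. Note that with unsorted input the returned values come back in sorted order rather than in their original order.

```python
import numpy as np


def slices_by_range(data_ar, range_ar, delta):
    """For each centre c in range_ar, return the values of data_ar in [c - delta, c + delta)."""
    # searchsorted needs sorted input; sorting here is an added assumption, not part of the answer above
    sorted_data = np.sort(data_ar)
    # one (low, high) pair per centre, shape (len(range_ar), 2)
    bounds = np.asarray(range_ar)[:, None] + delta * np.array([-1, 1])
    # side="left" (the default) returns the first index whose value is >= the bound,
    # so sorted_data[lo:hi] selects low <= value < high
    idx = np.searchsorted(sorted_data, bounds)
    # pack the slices into an object array, as the question asks for
    results_ar = np.empty(len(idx), dtype=object)
    for i, (lo, hi) in enumerate(idx):
        results_ar[i] = sorted_data[lo:hi]
    return results_ar


def slices_by_mask(data_ar, range_ar, delta):
    """Reference version using the boolean mask from the question."""
    return [data_ar[(data_ar >= c - delta) & (data_ar < c + delta)] for c in range_ar]


data_ar = np.array([1, 3, 4, 6, 10, 12])
range_ar = np.array([7, 4, 2])
fast = slices_by_range(data_ar, range_ar, delta=3)
slow = slices_by_mask(data_ar, range_ar, delta=3)
```

Each slice is a view into the sorted array, so no data is copied per range; the only remaining Python-level loop is over the index pairs.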

Upvotes: 0
