Reputation: 431
I'm looking for the fastest way to select the elements of a numpy array that satisfy several criteria. As an example, say I want to select all elements that lie between 0.2 and 0.8 from an array. I normally do something like this:
the_array = np.random.random(100000)
idx = (the_array > 0.2) * (the_array < 0.8)
selected_elements = the_array[idx]
However, this creates two additional arrays with the same size as the_array (one for the_array > 0.2 and one for the_array < 0.8). If the array is large, this can consume a lot of memory. Is there any way to get around this? All of the built-in numpy functions (such as logical_and) seem to do the this same thing under the hood.
Upvotes: 1
Views: 407
Reputation: 7842
You could implement a custom C call for the select. The most basic way to do this is through a ctypes
implementation.
select.c
int select(float lower, float upper, float* in, float* out, int n)
{
int ii;
int outcount = 0;
float val;
for (ii=0;ii<n;ii++)
{
val = in[ii];
if ((val>lower) && (val<upper))
{
out[outcount] = val;
outcount++;
}
}
return outcount;
}
which is compiled as:
gcc -lm -shared select.c -o lib.so
And on the python side:
select.py
import ctypes as C
from numpy.ctypeslib import as_ctypes
import numpy as np
# open the library in python
lib = C.CDLL("./lib.so")
# explicitly tell ctypes the argument and return types of the function
pfloat = C.POINTER(C.c_float)
lib.select.argtypes = [C.c_float,C.c_float,pfloat,pfloat,C.c_int]
lib.select.restype = C.c_int
size = 1000000
# create numpy arrays
np_input = np.random.random(size).astype(np.float32)
np_output = np.empty(size).astype(np.float32)
# expose the array contents to ctypes
ctypes_input = as_ctypes(np_input)
ctypes_output = as_ctypes(np_output)
# call the function and get the number of selected points
outcount = lib.select(0.2,0.8,ctypes_input,ctypes_output,size)
# select those points
selected = np_output[:outcount]
Don't expect wild speedups with such a vanilla implementation, but in the C side you have the option of adding in OpenMP
pragmas to get quick and dirty parallelism which may give you significant boosts.
Also as mentioned in the comments, numexpr may be a faster neater way to do all this in just a few lines.
Upvotes: 1