Reputation: 71
I'm using scipy.spatial.cKDTree.query_ball_point to get the number of data points within a specific radius of each point in a grid layout.
It works, but it returns an array of lists, and I only need the length of each list. Of course I can iterate through the array, but there must be a smarter way to get the length of each list in an array of lists, or maybe another way to find the number of data points that are within a specific radius of each grid point.
Any ideas on how to do this most efficiently?
Upvotes: 0
Views: 121
Reputation: 231605
The docs for this function say it returns:
    If x is an array of points, returns an object array of shape tuple containing lists of neighbors.
where
    x : array_like, shape tuple + (self.m,)
The talk of "shape tuple" is a little unclear, but I think it refers to x.shape[:-1], all but the last dimension of the input array. So for n points in a 2d space, x will be (n,2), and the result will be shape (n,).
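To check that reading of the shape rule, here is a minimal sketch. The random data, the 4x5 grid and the radius are made up for illustration; only cKDTree and query_ball_point come from the question.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
data = rng.random((200, 2))            # hypothetical data points in the unit square
tree = cKDTree(data)

# Hypothetical 4x5 grid of query points, stored as a (4, 5, 2) array.
gx, gy = np.meshgrid(np.linspace(0, 1, 5), np.linspace(0, 1, 4))
grid = np.stack([gx, gy], axis=-1)

# Query once with the grid flattened to (20, 2), once with the full (4, 5, 2) array.
flat_result = tree.query_ball_point(grid.reshape(-1, 2), r=0.2)
grid_result = tree.query_ball_point(grid, r=0.2)

print(flat_result.shape, flat_result.dtype)
print(grid_result.shape, grid_result.dtype)
```

In both cases the result shape should be the input shape with the last dimension dropped, and each element is a plain Python list of indices into data.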
For a simple 1d array of lists, just plain list comprehension is the best way:
In [36]: x=np.array([[1,2,3],[],[3,4]], dtype=object)
In [37]: x
Out[37]: array([[1, 2, 3], [], [3, 4]], dtype=object)
In [39]: [len(i) for i in x]
Out[39]: [3, 0, 2]
len(x) and x.shape apply to the array itself, not any elements.
x contains pointers to the lists, so any operation on those lists requires a Python-level access to each list. There aren't many vectorized array operations that propagate down to the elements of an object array. After all, the elements of such an array may be anything, including None.
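A small sketch of the difference between the array's own length and the lengths of its elements. The array is filled element by element here so that NumPy does not try to interpret the nested lists as extra dimensions; that construction is my choice for the example, not something from the question.

```python
import numpy as np

lists = [[1, 2, 3], [], [3, 4]]

# Preallocate a 1-D object array and store each list as a single element.
x = np.empty(len(lists), dtype=object)
for i, item in enumerate(lists):
    x[i] = item

array_length = len(x)                  # length of the array itself
array_shape = x.shape                  # shape of the array itself
element_lengths = [len(i) for i in x]  # one Python-level len() call per element

print(array_length, array_shape, element_lengths)
```

The first two values describe the container, while the list comprehension is what actually looks inside each element.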
If your input array is higher dimensional, e.g. (10,20,2), a 10x20 grid of points, it's probably easiest to flatten it first.
In [50]: X
Out[50]:
array([[[1, 2, 3], [1]],
       [[1, 2, 3], [3, 4]]], dtype=object)
In [51]: np.array([len(i) for i in X.flat]).reshape(2,2)
Out[51]:
array([[3, 1],
       [3, 2]])
In sum - list comprehension is the way to go, even though it is an array.
===============
There is another way of iterating over an array that handles multiple dimensions well, np.frompyfunc. In some tests it may save 20% over list iteration:
np.frompyfunc(len,1,1)(x).astype(int)
It returns an array of the right shape, though it too is dtype object, hence the astype call. np.vectorize uses this, but makes no claim to improving speed.
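As a rough sketch of how the two approaches line up on a 2d object array (the 2x2 array of lists below is a made-up example, filled cell by cell so each list stays a single element):

```python
import numpy as np

cells = [[[1, 2, 3], [1]],
         [[1, 2, 3], [3, 4]]]

# Hypothetical 2x2 object array whose elements are Python lists.
X = np.empty((2, 2), dtype=object)
for i in range(2):
    for j in range(2):
        X[i, j] = cells[i][j]

# frompyfunc wraps len as a ufunc-like callable: 1 input, 1 output.
count = np.frompyfunc(len, 1, 1)
lengths_ufunc = count(X).astype(int)   # result is dtype object until cast

# Equivalent flatten / list comprehension / reshape route.
lengths_flat = np.array([len(i) for i in X.flat]).reshape(X.shape)

print(lengths_ufunc)
print(lengths_flat)
```

The frompyfunc version keeps the input shape on its own, so no reshape is needed. Any speed difference is worth timing on your own data rather than assuming.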
Upvotes: 1
Reputation: 10759
Sounds like you are looking for this: scipy.spatial.cKDTree.count_neighbors
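A hedged sketch of how that compares with per-point counts. As far as I can tell, count_neighbors with a scalar radius returns one total number of pairs between two trees rather than one count per grid point. The return_length=True argument to query_ball_point is assumed to be available (SciPy 1.3 or later). The data, grid and radius are invented for the example.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
data = rng.random((100, 2))   # hypothetical data points
grid = rng.random((10, 2))    # hypothetical query (grid) points
r = 0.25

data_tree = cKDTree(data)
grid_tree = cKDTree(grid)

# Per-point counts via the list lengths, as in the other answers.
per_point = np.array([len(n) for n in data_tree.query_ball_point(grid, r)])

# Per-point counts directly (assumes a SciPy version with return_length).
per_point_direct = data_tree.query_ball_point(grid, r, return_length=True)

# Single total: number of (grid point, data point) pairs within r.
total_pairs = grid_tree.count_neighbors(data_tree, r)

print(per_point)
print(per_point_direct)
print(total_pairs)
```

If only the grand total is needed, count_neighbors avoids building the index lists at all. For one count per grid point, return_length looks like the more direct fit.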
Upvotes: 0
Reputation: 333
You can also use a more "mathematical" way:
lengths = map(len, myarray)
It will return a map object (a lazy iterator in Python 3) that you can iterate over.
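A minimal sketch of consuming that map object, assuming Python 3 and a 1d object array of lists (myarray is built by hand here for illustration):

```python
import numpy as np

lists = [[1, 2, 3], [], [3, 4]]
myarray = np.empty(len(lists), dtype=object)
for i, item in enumerate(lists):
    myarray[i] = item

lengths = map(len, myarray)      # lazy: nothing is computed yet
as_list = list(lengths)          # consumes the iterator
leftover = list(lengths)         # a map object can only be consumed once

# Building a NumPy integer array straight from a fresh iterator.
as_array = np.fromiter(map(len, myarray), dtype=int, count=len(myarray))

print(as_list, leftover, as_array)
```

Note that the map object is exhausted after the first pass, so materialize it with list or np.fromiter if the lengths are needed more than once.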
Upvotes: 1
Reputation: 34387
Call len() on each element in the array, i.e.:
lengths=[len(x) for x in myarray]
Upvotes: 2