Reputation: 1539
I have a 2D grid representing a set of pixels. For each pixel, I have the coordinates of the top left corner.
I also have a very long list of randomly distributed 2D points. I am looking for an efficient way of finding the indices of the points present in each pixel.
For the moment I have the following:
import numpy as np
xgrid = np.linspace(0,10,11)
ygrid = np.linspace(0,10,11)
X_random_points = np.random.rand(int(1e7))*10
Y_random_points = np.random.rand(int(1e7))*10
for iterationX in range(0,len(xgrid)-1):
for iterationY in range(0,len(ygrid)-1):
valuesInCube = (X_random_points<xgrid[iterationX]) & (X_random_points>xgrid[iterationX-1]) & (Y_random_points<ygrid[iterationY]) &(Y_random_points>ygrid[iterationY-1])
I was wondering if someone had an idea of how to make this quicker?
Upvotes: 1
Views: 260
Reputation: 10590
I can give you a related approach that might still be useful. You can instead find which pixel each point belongs to. The functions numpy.digitize
and scipy.stats.binned_statistic_2d
are useful here. scipy.stats.binned_statistic_2d
feels a little clumsy since it does more than just bin your points and requires you to give some values for each of your x,y points.
You should note that the bin numbering starts counting at 1
(not 0
).
x_p, y_p = np.digitize(X_random_points, xgrid), np.digitize(Y_random_points, xgrid)
# OR #
_, _, _, (x_p, y_p) = scipy.stats.binned_statistic_2d(X_random_points, Y_random_points, np.zeros(len(X_random_points)), bins=(xgrid, ygrid), expand_binnumbers=True)
For a particular pixel, you can even find all points that belong to that pixel from x_p
and y_p
.
Upvotes: 1
Reputation: 114320
You can use np.floor
to vectorize the whole operation and avoid looping entirely, as long as the separation between pixels is even in each direction. For your simple case, where xgrid
and ygrid
are integers, you can just do
X_random_points = ...
Y_random_points = ...
x_pixels = np.floor(X_random_points)
y_pixels = np.floor(Y_random_points)
If your pixels are not on an integer grid, you have to know the separation between them. In this case, I would recommend using np.arange
instead of np.linspace
to generate pixel locations:
delta_x, delta_y = 0.5, 0.5
xgrid = np.arange(0, 5.1, delta_x)
ygrid = np.arange(0, 5.1, delta_y)
X_random_points = np.random.rand(int(1e7)) * 5
Y_random_points = np.random.rand(int(1e7)) * 5
x_pixels = np.floor(X_random_points / delta_x)
y_pixels = np.floor(Y_random_points / delta_y)
You are really doing the same thing for the integer case, since both delta_x
and delta_y
are both 1
then.
Upvotes: 1