Anthony Lethuillier
Anthony Lethuillier

Reputation: 1539

Finding efficiently which points are in each pixel with Python

I have a 2D grid representing a set of pixels. For each pixel, I have the coordinates of the top left corner.

I also have a very long list of randomly distributed 2D points. I am looking for an efficient way of finding the indices of the points present in each pixel.

For the moment I have the following:

import numpy as np
xgrid = np.linspace(0,10,11)
ygrid = np.linspace(0,10,11)

X_random_points = np.random.rand(int(1e7))*10
Y_random_points = np.random.rand(int(1e7))*10

for iterationX in range(0,len(xgrid)-1):
    for iterationY in range(0,len(ygrid)-1):
        valuesInCube = (X_random_points<xgrid[iterationX]) & (X_random_points>xgrid[iterationX-1]) & (Y_random_points<ygrid[iterationY]) &(Y_random_points>ygrid[iterationY-1])  

I was wondering if someone had an idea of how to make this quicker?

Upvotes: 1

Views: 260

Answers (2)

busybear
busybear

Reputation: 10590

I can give you a related approach that might still be useful. You can instead find which pixel each point belongs to. The functions numpy.digitize and scipy.stats.binned_statistic_2d are useful here. scipy.stats.binned_statistic_2d feels a little clumsy since it does more than just bin your points and requires you to give some values for each of your x,y points.

You should note that the bin numbering starts counting at 1 (not 0).

x_p, y_p = np.digitize(X_random_points, xgrid), np.digitize(Y_random_points, xgrid)

# OR #

_, _, _, (x_p, y_p) = scipy.stats.binned_statistic_2d(X_random_points, Y_random_points, np.zeros(len(X_random_points)), bins=(xgrid, ygrid), expand_binnumbers=True)

For a particular pixel, you can even find all points that belong to that pixel from x_p and y_p.

Upvotes: 1

Mad Physicist
Mad Physicist

Reputation: 114320

You can use np.floor to vectorize the whole operation and avoid looping entirely, as long as the separation between pixels is even in each direction. For your simple case, where xgrid and ygrid are integers, you can just do

X_random_points = ...
Y_random_points = ...
x_pixels = np.floor(X_random_points)
y_pixels = np.floor(Y_random_points)

If your pixels are not on an integer grid, you have to know the separation between them. In this case, I would recommend using np.arange instead of np.linspace to generate pixel locations:

delta_x, delta_y = 0.5, 0.5
xgrid = np.arange(0, 5.1, delta_x)
ygrid = np.arange(0, 5.1, delta_y)

X_random_points = np.random.rand(int(1e7)) * 5
Y_random_points = np.random.rand(int(1e7)) * 5

x_pixels = np.floor(X_random_points / delta_x)
y_pixels = np.floor(Y_random_points / delta_y)

You are really doing the same thing for the integer case, since both delta_x and delta_y are both 1 then.

Upvotes: 1

Related Questions