Scipy KDTree() Get rectangular shaped neighbouring grid points

Question

I've come across a small problem while using this module. In fact, the module does exactly what I ask it to do: it finds the nearest grid points to a given pair of coordinates.

But when the given coordinates are very close to a grid point and the grid step is longer along one axis than the other, it returns something like this:

[Image: scipy.spatial.KDTree results]

In this image, the query point is the red dot in the bottom left corner. The results given by KDTree are the blue squares. The green diamond is the 4th point I would like to get instead of the lone blue square at the top of the image.

Code :

>>> grid.head()
          x         y
0  0.000000 -9.490125
1  0.959131 -9.490125
2  1.918263 -9.490125
3  2.877394 -9.490125
4  3.836526 -9.490125

>>> pt
[4.0092010999999998e-05, -9.4901299629261011]

>>> import scipy.spatial as ssp
>>> tree = ssp.KDTree(grid)
>>> dis, idx = tree.query(pt, 4)

>>> idx 
array([  0,  71,   1, 142])

>>> grid.iloc[idx]
            x         y
0    0.000000 -9.490125
71   0.000000 -8.980481
1    0.959131 -9.490125
142  0.000000 -8.470837

Question:

Is there a way to tell the query that we want a rectangular set of neighbours? Maybe by specifying that we only want two y values for each x?

keepAlive · Accepted Answer

First, let us build a Minimal, Complete, and Verifiable example:

>>> import pandas as pd
>>> import numpy as np
>>> x0, dx = 0, 0.959131
>>> x  = x0 + dx*np.arange(5)  # 5 columns
>>> y0, dy = -9.4901299629261011, 8.980481-8.470837
>>> y  = y0 + dy*np.arange(3)  # 3 rows; avoids np.arange's float end-point ambiguity
>>> data = np.transpose([np.tile(x, len(y)), np.repeat(y, len(x))])
>>> grid = pd.DataFrame(data=data, columns=['x', 'y'])
>>> grid.head()
          x        y
0  0.000000 -9.49013
1  0.959131 -9.49013
2  1.918262 -9.49013
3  2.877393 -9.49013
4  3.836524 -9.49013

Here grid is the numerical equivalent of the grid shown in your figure:

>>> grid
           x         y
0   0.000000 -9.490130 # the red dot
1   0.959131 -9.490130 # the bottom right blue square
2   1.918262 -9.490130
3   2.877393 -9.490130
4   3.836524 -9.490130
5   0.000000 -8.980486 # the middle left blue square
6   0.959131 -8.980486 # the green diamond
7   1.918262 -8.980486
8   2.877393 -8.980486
9   3.836524 -8.980486
10  0.000000 -8.470842 # the unwanted top left blue square
11  0.959131 -8.470842
12  1.918262 -8.470842
13  2.877393 -8.470842
14  3.836524 -8.470842

Thus, you want points 1, 5 and 6 as the neighbourhood of point 0.

To do so, you may want to have a look at the function kneighbors_graph of the sklearn.neighbors module, which implements the k-nearest neighbors algorithm. The trick is to set the power parameter of the Minkowski metric, p, to a value greater than 2, say 3. The idea behind p>2 is to push the Euclidean square-root-of-2 ratio between the diagonal and the side of a unit square toward 1, so that diagonal neighbours are penalised less. This is done as follows

>>> from sklearn.neighbors import kneighbors_graph
>>> _3n_graph = kneighbors_graph(grid,
                                 n_neighbors=3,
                                 p=3,
                                 mode='connectivity',
                                 include_self=False)

yields

>>> grid.iloc[_3n_graph[0].indices]
          x         y
5  0.000000 -8.980486
1  0.959131 -9.490130
6  0.959131 -8.980486
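
If you would rather stay with scipy, the same idea should carry over, since KDTree.query also accepts a Minkowski p argument. The following is only a sketch under a few assumptions: the grid is rebuilt from rounded step sizes (dx, dy and y0 are typed in by hand here), and the last part, rescaling each axis by its step so that every cell becomes a unit square, is my own suggestion rather than something taken from the question.

```python
import numpy as np
from scipy.spatial import KDTree

# Rebuild the 5 x 3 example grid (rounded step sizes, entered by hand).
dx, dy = 0.959131, 0.509644
y0 = -9.490130
x = dx * np.arange(5)
y = y0 + dy * np.arange(3)
points = np.column_stack([np.tile(x, len(y)), np.repeat(y, len(x))])

# Query point sitting almost on top of grid point 0, as in the question.
pt = np.array([4.0092011e-05, y0])

tree = KDTree(points)

# Euclidean metric (p=2): point 10, two steps up, beats the diagonal point 6.
_, idx = tree.query(pt, k=4, p=2)
nn_p2 = sorted(idx.tolist())

# Minkowski metric with p=3: the diagonal point 6 now comes first.
_, idx = tree.query(pt, k=4, p=3)
nn_p3 = sorted(idx.tolist())

# Hypothetical alternative: divide each axis by its step so cells become
# unit squares, then query with the default Euclidean metric.
steps = np.array([dx, dy])
scaled_tree = KDTree(points / steps)
_, idx = scaled_tree.query(pt / steps, k=4)
nn_scaled = sorted(idx.tolist())

print(nn_p2)      # [0, 1, 5, 10]
print(nn_p3)      # [0, 1, 5, 6]
print(nn_scaled)  # [0, 1, 5, 6]
```

Note that p=3 only works here by a small margin: the diagonal neighbour wins over the two-steps-up neighbour as long as dx**3 + dy**3 < (2*dy)**3, i.e. dx below roughly 1.91*dy, and this grid has dx/dy of about 1.88. For more elongated cells, rescaling the axes is probably the safer option.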
