Caustic

Reputation: 979

How to interpolate using nearest neighbours for high dimension numpy python arrays

I am programming in Python using SciPy and NumPy. I have a look-up table (LUT) of data that I access like so:

self.lut_data[n_iter][m_iter][l_iter][k_iter][j_iter][i_iter] 

where each *_iter index corresponds to an array of values that I keep in a dictionary. For example, the i_iter index corresponds to the wavelength of light, so I have a dictionary of labels whose values I can get by:

labels['wavelength']

and it will return an array of wavelengths that each i_iter corresponds to. This is useful as a straight look-up: if I want the lut_data at 500 nm, I find the corresponding index in labels['wavelength'] first and use that to index

lut_data[][][][][][wavelength_index]

I do the same for the other dimensions, which include things like viewing angles etc.; they correspond to the other *_iters.
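For reference, the direct (non-interpolating) look-up described above can be sketched like this; labels, lut_data and the 500 nm target are small stand-ins for the real structures:

```python
import numpy as np

# Stand-in for the real structures: labels maps a dimension name to the
# grid values along that axis, lut_data is (the last axis of) the LUT.
labels = {'wavelength': np.array([400., 450., 500., 550., 600.])}
lut_data = np.array([0.1, 0.2, 0.3, 0.4, 0.5])

# Index of the entry nearest to 500 nm
wavelength_index = int(np.argmin(np.abs(labels['wavelength'] - 500.0)))
value = lut_data[wavelength_index]
```

This only returns values exactly on the grid, which is the behaviour the question wants to go beyond.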

What I need to do is find values between the ones in the look-up table, and I need it to work when I don't know the dimensions of the table beforehand. If I do know them, then I know how to solve the problem with a loop for each dimension. But if I don't know how many dimensions the LUT has, then I don't know how many loops to nest.

I think I should be able to do it using cKDTree, but I cannot get my head around how to make it work. I would really appreciate an example that looks similar to my structures.

Thanks

Upvotes: 3

Views: 6508

Answers (2)

user2379410

Reputation:

The scipy.interpolate.RegularGridInterpolator would be a great fit for this problem, though it is only available in SciPy 0.14 (the latest release as of this writing).

If you have the arrays of axis values that your *_iters index into stored in variables, you could do:

from scipy.interpolate import RegularGridInterpolator

points = tuple([n_iter, m_iter, l_iter, k_iter, j_iter, i_iter])
interpolator = RegularGridInterpolator(points, lut_data, method='nearest')

Or you can get the points from your dictionary:

keys = ['k1', 'k2', 'k3', 'k4', 'k5', 'wavelength']
points = tuple([labels[key] for key in keys])

Once you have the interpolator, you can use its __call__ method to do an interpolation. This basically means you can call the class instance you created as a function:

point_of_interest = tuple([x1, x2, x3, x4, x5, some_wavelength])
interp_value = interpolator(point_of_interest)

The interpolator also allows you to interpolate many values at once (i.e. pass a NumPy array of points), which can be significantly more efficient if your code requires this.
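A minimal, self-contained sketch of that batch usage, on a made-up 2-D grid (the axis names and values here are illustrative only, not from the question):

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Toy stand-in grid: 2 angles x 3 wavelengths
angles = np.array([0., 30.])
wavelengths = np.array([500., 550., 600.])
lut_data = np.arange(6, dtype=float).reshape(2, 3)

interpolator = RegularGridInterpolator((angles, wavelengths), lut_data,
                                       method='nearest')

# Many query points at once: one row per point, one column per dimension
points_of_interest = np.array([[10., 510.],
                               [25., 590.]])
values = interpolator(points_of_interest)  # one interpolated value per row
```

With method='nearest' each query snaps to the closest grid node along every axis; switching to method='linear' gives multilinear interpolation with the same call.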

Upvotes: 1

DrV

Reputation: 23540

If you have a full array of information to interpolate from, linear interpolation is not that difficult. It is slightly time-consuming, but if your array fits in RAM, it is just a matter of seconds.

The trick is that linear interpolation can be done one axis at a time. So, for each axis:

  • find the nearest points to interpolate between
  • find the relative distance (d = 0..1) between these points, e.g. if you have 540 and 550 nm, and you want to have data at 548 nm, d = 0.8.
  • repeat this procedure for all axes; each round will reduce the number of dimensions by one

Like this:

import numpy as np

def ndim_interp(A, ranges, p):
    # A: array with n dimensions
    # ranges: list of n lists or numpy arrays of values along each dimension
    # p: vector of values to find (n elements)

    # iterate through all dimensions
    for i in range(A.ndim):
        # check if we are overrange; if we are, use the edgemost values
        if p[i] <= ranges[i][0]:
            A = A[0]
            continue
        if p[i] >= ranges[i][-1]:
            A = A[-1]
            continue

        # find the nearest values
        right = np.searchsorted(ranges[i], p[i])
        left = right - 1

        # find the relative distance
        d = (p[i] - ranges[i][left]) / (ranges[i][right] - ranges[i][left])

        # calculate the interpolation
        A = (1 - d) * A[left] + d * A[right]            

    return A

As an example:

# data axis points
arng = [1, 2, 3]
brng = [100, 200]
crng = [540, 550, 560]

# some data
A = np.array([
    [[1., 2., 3.], [2., 3., 4.]],
    [[0.5, 1.5, 2.], [1.5, 2.0, 3.0]],
    [[0., 0.5, 1.], [1., 1., 1.]]])

# lookup:
print(ndim_interp(A, (arng, brng, crng), (2.3, 130., 542.)))  # approximately 0.79

If you want to do something more complicated (cubic splines, etc.), then you may use scipy.ndimage.interpolation.map_coordinates. Then the recipe changes as follows:

import numpy as np
import scipy.ndimage.interpolation

def ndim_interp(A, ranges, p):
    # A: array with n dimensions
    # ranges: list of n lists or numpy arrays of values along each dimension
    # p: vector of values to find (n elements)

    # calculate the coordinates into array positions in each direction
    p_arr = []
    # iterate through all dimensions
    for i in range(A.ndim):
        # check if we are overrange; if we are, use the edgemost values
        if p[i] <= ranges[i][0]:
            p_arr.append(0)
            continue
        if p[i] >= ranges[i][-1]:
            p_arr.append(A.shape[i] - 1)
            continue

        # find the nearest values to the left
        right = np.searchsorted(ranges[i], p[i])
        left = right - 1

        # find the relative distance
        d = (p[i] - ranges[i][left]) / (ranges[i][right] - ranges[i][left])

        # append the position
        p_arr.append(left + d)

    coords = np.array(p_arr).reshape(A.ndim, -1)
    return scipy.ndimage.interpolation.map_coordinates(A, coords, order=1, mode='nearest')[0]

Of course, there is no point in using this with the simplest settings (order=1 is just linear interpolation), but even with a cubic spline it is not so simple to write your own interpolation algorithm.

Naturally, if your grids are equispaced in all directions, the code is simpler, as there is no need to search for the correct position first (a simple division will do).
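As a sketch of that simplification, on an equispaced axis the fractional array position comes straight from a division (start and step here are assumed known for the axis):

```python
import numpy as np

axis_start = 540.0  # first grid value on this axis
axis_step = 10.0    # constant spacing (equispaced grid assumed)

p = 548.0
pos = (p - axis_start) / axis_step  # fractional array position, here 0.8
left = int(np.floor(pos))           # index of the grid point to the left
d = pos - left                      # relative distance used by the linear blend
```

This replaces the np.searchsorted step in the functions above; pos can be fed directly to map_coordinates.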

Upvotes: 1
