Reputation: 18874
OpenCV's remap() uses a real-valued index grid to sample a grid of values from an image using bilinear interpolation, and returns the grid of samples as a new image.
To be precise, let:
A = an image
X = a grid of real-valued X coords into the image.
Y = a grid of real-valued Y coords into the image.
B = remap(A, X, Y)
Then for all pixel coordinates i, j,
B[i, j] = A(X[i, j], Y[i, j])
where the round-braces notation A(x, y) denotes using bilinear interpolation to solve for the pixel value of image A at the float-valued coords x and y.
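For concreteness, here is a minimal sketch of that convention using OpenCV's Python bindings (the tiny image and identity maps are just for illustration):
import numpy as np
import cv2

A = np.arange(16, dtype=np.float32).reshape(4, 4)
# Identity maps: X[i, j] = j and Y[i, j] = i, so B should equal A.
Y, X = np.indices(A.shape)
X = X.astype(np.float32)
Y = Y.astype(np.float32)
B = cv2.remap(A, X, Y, interpolation=cv2.INTER_LINEAR)
assert np.allclose(A, B)
# Shifting X by 0.5 samples halfway between horizontal neighbors.
B_half = cv2.remap(A, X + 0.5, Y, interpolation=cv2.INTER_LINEAR)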
My question is: given an index grid X, Y, how can I generate an "inverse grid" X^-1, Y^-1 such that:
X(X^-1[i, j], Y^-1[i, j]) = i
Y(X^-1[i, j], Y^-1[i, j]) = j
and
X^-1(X[i, j], Y[i, j]) = i
Y^-1(X[i, j], Y[i, j]) = j
for all integer pixel coordinates i, j?
FWIW, the image and the index maps X and Y are the same shape. However, there is no a priori structure to the index maps X and Y. For example, they're not necessarily affine or rigid transforms. They may even be uninvertible, e.g. if X, Y maps multiple pixels in A to the same exact pixel coordinate in B. I'm looking for ideas for a method that will find a reasonable inverse map if one exists.
The solution need not be OpenCV-based, as I'm not using OpenCV, but another library that has a remap() implementation. While any suggestions are welcome, I'm particularly keen on something that's "mathematically correct", i.e. if my map M is perfectly invertible, the method should find the perfect inverse, within some small margin of machine precision.
Upvotes: 29
Views: 10951
Reputation: 11
I used a resampling approach with barycentric interpolation. I optimized it with numba. The result is fast and resilient to distortion, rotation, scaling, symmetry and zoom.
import math
import numba
import numpy as np
import numpy.typing as npt
@numba.njit()
def vertex_index_buffer(h: int, w: int) -> npt.NDArray:
"""
    Each quad formed by 4 points can be split into 2 triangles.
    Returns a 2D array of height (h-1)*(w-1)*2 and width 3. Each row corresponds to a triangle.
"""
N = (h - 1) * (w - 1) * 2
n = 0
triangle_vib = np.empty((N, 3), dtype=np.int32)
    # for each quadrilateral
for y in range(h - 1):
for x in range(w - 1):
# indexes of the 4 points
ind0 = y * w + x
ind1 = ind0 + 1
ind2 = ind0 + w
ind3 = ind2 + 1
# fill 2 triangles
triangle_vib[n, :] = ind0, ind1, ind2
triangle_vib[n + 1, :] = ind2, ind1, ind3
n += 2
return triangle_vib
@numba.jit(nopython=True)
def invert_map(xmap: npt.NDArray, ymap: npt.NDArray) -> tuple[npt.NDArray, npt.NDArray]:
h, w = xmap.shape
xmap_inv = np.zeros_like(xmap) - 1
ymap_inv = np.zeros_like(ymap) - 1
triangle_vib = vertex_index_buffer(h, w)
# for each triangle
for k0, k1, k2 in triangle_vib:
        # get the xy coordinates of the triangle's vertices
x0 = xmap.ravel()[k0]
x1 = xmap.ravel()[k1]
x2 = xmap.ravel()[k2]
y0 = ymap.ravel()[k0]
y1 = ymap.ravel()[k1]
y2 = ymap.ravel()[k2]
# barycentric coordinates
dy21 = y1 - y2
dx20 = x0 - x2
dx12 = x2 - x1
dy20 = y0 - y2
norm = dy21 * dx20 + dx12 * dy20
i0 = k0 // w
i1 = k1 // w
i2 = k2 // w
j0 = k0 % w
j1 = k1 % w
j2 = k2 % w
# search area (rectangle surrounding current triangle)
xmin = int(math.floor(min(x0, x1, x2)))
ymin = int(math.floor(min(y0, y1, y2)))
xmax = int(math.ceil(max(x0, x1, x2)))
ymax = int(math.ceil(max(y0, y1, y2)))
xmin = min(max(0, xmin), w - 1)
ymin = min(max(0, ymin), h - 1)
xmax = min(max(0, xmax), w - 1)
ymax = min(max(0, ymax), h - 1)
        # degenerate (near-zero area) triangle: fill its bounding box with vertex 0
        if abs(norm) <= 0.01:
xmap_inv[ymin:ymax, xmin:xmax] = j0
ymap_inv[ymin:ymax, xmin:xmax] = i0
continue
for px in range(xmin, xmax):
pwx0 = dy21 * (px - x2)
pwx1 = -dy20 * (px - x2)
for py in range(ymin, ymax):
# compute normalized weights of barycentric coordinates. Sum of weights must be 1
w0 = (pwx0 + dx12 * (py - y2)) / norm
w1 = (pwx1 + dx20 * (py - y2)) / norm
w2 = 1 - w0 - w1
# barycentric interpolation
xmap_inv[py, px] = (j0 * w0 + j1 * w1 + j2 * w2)
ymap_inv[py, px] = (i0 * w0 + i1 * w1 + i2 * w2)
return xmap_inv, ymap_inv
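A possible usage sketch (the smooth sinusoidal warp below is an assumption for illustration, not from the original post):
import numpy as np

h, w = 256, 256
yy, xx = np.indices((h, w))
# Smooth synthetic forward map to invert.
xmap = (xx + 5 * np.sin(yy / 20)).astype(np.float32)
ymap = (yy + 5 * np.cos(xx / 20)).astype(np.float32)
xmap_inv, ymap_inv = invert_map(xmap, ymap)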
I compared it with the iterative solution proposed by Hannesh. The iterative solution is faster than mine, but it fails under rotation and symmetry.
I've created a repo, invert_map, for benchmarking the different algorithms I tried. Feel free to add yours.
Upvotes: 1
Reputation: 49
Solution https://stackoverflow.com/a/68706787/4521113 is great, but I was not satisfied with the provided explanation. Here I will contribute my interpretation of that solution, the assumptions I think it makes, and the limitations that arise from those assumptions.
Assume we have a function f(x) and we want to obtain a value x that produces f(x) = y. As an example, assume f(x) = x² and we want to find the value x that produces f(x) = 4. For this concrete case, we can invert the function and use x = f⁻¹(y) = sqrt(y), so f⁻¹(4) = sqrt(4) = 2, which gives us the solution.
However, a function is not always invertible, or finding the inverse could be non-trivial. In such cases, we can redefine the problem as a minimization problem. Let's define the loss function C(x) = ( y - f(x) )², where y is the value we want to obtain after evaluating f(x). Finding x for which f(x) = y is equivalent to minimizing C(x).
There are plenty of algorithms used to find the minimum of a function. Let's consider gradient descent to solve this problem, just because. In our case, we would iterate on x to find the solution through
x_{k+1} = x_k - alpha * dC/dx = x_k + 2 * alpha * ( y - f(x) ) * df/dx
Applying this to our particular example, starting from x_0 = 1:
import math
alpha = 5.0e-2
x = 1
for i in range(100):
x = x + 2 * alpha * ( 4.0 - x**2 ) * (2*x)
print(x)
we observe how the value of x slowly approaches 2, which we know to be the solution to our minimization problem.
Maps X[i,j] and Y[i,j] can be thought of as functions from R² to R, which combined produce a function F = ( X(i,j) , Y(i,j) ) from R² to R²; it maps pixel coordinates in the original image to pixel coordinates in the target image. Inverting the map is equivalent to finding F⁻¹ = ( X⁻¹(i',j') , Y⁻¹(i',j') ), which maps pixel coordinates in the target image to pixel coordinates in the original image. And again, this problem can be reformulated as a minimization problem by defining the cost function C(i,j) = || (i',j') - ( X(i,j) , Y(i,j) ) ||². And again, we can iterate using gradient descent to find the coordinates (i,j) in the original image that are mapped to the coordinates (i',j') in the target image:
(i,j)_{k+1} = (i,j)_k + 2 alpha [ (i',j') - ( X(i,j) , Y(i,j) ) ] * J
where J is the Jacobian of the map:
J = [ dX(i,j)/di   dX(i,j)/dj ]
    [ dY(i,j)/di   dY(i,j)/dj ]
and we assume (i,j) and (i',j') to be 2D row vectors. This starts to resemble the solution mentioned above.
There are multiple variants of gradient descent. Some of them use a "conjugate direction": a direction different from the gradient, but one that also leads to the minimum. The solution proposed by Hannesh substitutes the identity matrix for the Jacobian J. Hence, the assumption is that the identity matrix times a scale factor alpha' is a valid approximation for the Jacobian times 2 times the "learning rate" alpha: 2 * alpha * J is approximated by alpha' * I. Finally, alpha' is chosen to be 1.
Introducing these changes into the iterative algorithm, we obtain:
(i,j)_{k+1} = (i,j)_k + [ (i',j') - ( X(i,j) , Y(i,j) ) ]
Now, we can build an image of target indices (i',j'), and approximate the evaluation of ( X(i,j) , Y(i,j) ) by using the remap function. That would yield the final version of the proposed algorithm.
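As a concrete sketch, a damped version of the iteration with an explicit learning rate alpha' could look like this (function and variable names are mine, not part of the original solution):
import numpy as np
import cv2

def invert_map_damped(xmap, ymap, alpha=1.0, iters=50):
    h, w = xmap.shape
    F = np.dstack([xmap, ymap]).astype(np.float32)   # forward map
    yy, xx = np.indices((h, w), dtype=np.float32)
    I = np.dstack([xx, yy])                          # target indices (i', j')
    P = I.copy()                                     # current inverse estimate
    for _ in range(iters):
        # remap evaluates (an approximation of) F at the current estimate P.
        FP = cv2.remap(F, P, None, interpolation=cv2.INTER_LINEAR)
        P += alpha * (I - FP)                        # damped update step
    return P[..., 0], P[..., 1]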
I implemented the solution to approximate the inverse map of my camera calibration function. That camera calibration function takes coordinates in the original distorted image and transforms them into coordinates in the undistorted image. Here you can see the result of applying the iterative algorithm pixel by pixel, WITHOUT using the remap function, but evaluating the current solution with the exact map:
The result provided by the algorithm is an approximation, because the remap function provides an approximation to the evaluation of the map at the current solution. Here you can see the result of applying the algorithm using the remap function, a learning rate of alpha'=1e-2, and iterating 1000 times:
Note the artifacts on the borders of the image, and the lack of mapping in the bottom right corner. The interpolation method used was INTER_LINEAR, but using INTER_CUBIC does not really help either:
The inverse map computed using the exact map was obtained using a learning rate of alpha'=5e-1 and 50 iterations. However, choosing the learning rate inappropriately can also lead to artifacts in the final result. Find here the result of using alpha'=1e0 and 50 iterations:
Note the artifacts obtained in the corners of the image. That is the consequence of the algorithm not converging because of a too-large learning rate.
On the other hand, check the result of using alpha'=1e-2 and 50 iterations:
Note how the "undistorted" image is not totally undistorted, and "straight" lines are still curved. That is the consequence of the algorithm not converging because of a too-small learning rate.
Upvotes: 0
Reputation: 1
Well, to get the distorted image back from the undistorted one, maybe you can use OpenCV's undistortPoints function to get the reverse map. With initUndistortRectifyMap you get the map from distorted to undistorted; with undistortPoints you can get the map from undistorted to distorted, point by point. Then use remap to get the distorted image.
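A hedged sketch of that idea (the calibration values below are placeholders; substitute your own K, D, new_K and image size):
import numpy as np
import cv2

# Placeholder calibration inputs.
h, w = 480, 640
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
D = np.array([-0.2, 0.05, 0.0, 0.0, 0.0])
new_K = K.copy()

# Pixel grid of the *distorted* image.
ys, xs = np.indices((h, w))
pts = np.stack([xs, ys], axis=-1).reshape(-1, 1, 2).astype(np.float32)
# undistortPoints maps distorted pixels to undistorted pixel coordinates;
# passing P=new_K returns pixel rather than normalized coordinates.
und = cv2.undistortPoints(pts, K, D, P=new_K).reshape(h, w, 2).astype(np.float32)
# Using this grid with remap warps an undistorted image back into the
# distorted geometry, e.g.:
# redistorted = cv2.remap(undistorted_img, und[..., 0], und[..., 1], cv2.INTER_LINEAR)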
Upvotes: 0
Reputation: 467
A KNNRegressor has all the necessary components to invert the grid mapping!
Here you go:
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def get_inverse_maps(map1, map2):
    regressor = KNeighborsRegressor(3)
    # Features: forward-mapped (row, col) positions; targets: their integer grid indices.
    X = np.concatenate((map2[..., None], map1[..., None]), axis=-1).reshape(-1, 2)
    y = np.indices(map1.shape).transpose((1, 2, 0)).reshape(-1, 2)
    regressor.fit(X, y)
    # y is exactly the integer pixel grid, so predicting on it evaluates
    # the inverse map at every destination pixel.
    map_inv = regressor.predict(y).reshape(map1.shape + (2,)).astype(np.float32)
    map_inv2, map_inv1 = map_inv[..., 0], map_inv[..., 1]
    return map_inv1, map_inv2
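A quick usage sketch with a synthetic forward map (illustrative values):
import numpy as np

h, w = 200, 200
yy, xx = np.indices((h, w))
map1 = (xx + 4 * np.sin(yy / 25)).astype(np.float32)  # x coordinates
map2 = (yy + 4 * np.cos(xx / 25)).astype(np.float32)  # y coordinates
inv1, inv2 = get_inverse_maps(map1, map2)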
Upvotes: 1
Reputation: 9
One way to do it is to take the original map, iterate through its entries, and take floors and ceilings of the x and y values. This gives the four nearest integer pairs around (x,y), namely (xf,yf), (xc,yf), (xf,yc), and (xc,yc), in the coordinates of the original source image. You can then fill in a structure with each of these as an index, containing the pixel value and a weight, and use your preferred irregular-grid interpolation with those data.
This is easy to implement with inverse distance interpolation, since the structure can be an image array accumulation and the weights are scalars. F is the original source, G is the warped image, and F' is the restored image. The map is M.
Init F' to 0. Create a 0-initialized weight array W of floats the same size as F'.
Iterate through M. For each entry in M, find the 4 integer pairs and their distances from (x,y). Take the corresponding pixel value from G, weight it by its reciprocal distance, and accumulate it into F' like
F'(xf|c, yf|c) += G(i,j)/sqrt((x-xf|c)^2 + (y-yf|c)^2)
Then accumulate that weight into
W(xf|c, yf|c) += 1./sqrt((x-xf|c)^2 + (y-yf|c)^2)
After that is finished, normalize F' by iterating through it and dividing each pixel by its corresponding entry in W, if it is non-zero.
At this point, the image is usually nearly complete, but with high downsampling ratios, some pixels in F' may not get filled in. So then you do a couple passes back and forth through W to find 0 weight entries, and interpolate those pixels from their non-empty neighbors. This part could be done with KNN search and interpolate too since there usually aren't many of them.
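A rough NumPy sketch of the accumulation and normalization steps above (without the final hole-filling pass); Fp stands for F', and G and the maps are assumed to share one shape:
import numpy as np

def splat_restore(G, map_x, map_y, eps=1e-8):
    h, w = G.shape
    Fp = np.zeros((h, w))  # accumulated restored image F'
    W = np.zeros((h, w))   # accumulated weights
    x, y, g = map_x.ravel(), map_y.ravel(), G.ravel()
    xf, yf = np.floor(x).astype(int), np.floor(y).astype(int)
    # Visit the four integer corners (floor/ceil in x and y) around each (x, y).
    for dx in (0, 1):
        for dy in (0, 1):
            xi, yi = xf + dx, yf + dy
            ok = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
            d = np.sqrt((x - xi) ** 2 + (y - yi) ** 2) + eps
            wgt = 1.0 / d  # reciprocal-distance weight
            np.add.at(Fp, (yi[ok], xi[ok]), g[ok] * wgt[ok])
            np.add.at(W, (yi[ok], xi[ok]), wgt[ok])
    nz = W > 0
    Fp[nz] /= W[nz]  # normalize by total weight where anything landed
    return Fp, W     # W == 0 marks holes still to be interpolated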
It's easy to implement and scales a lot better than the KNN approach (though I think that's great for small images). The downside is that inverse distance isn't the greatest interpolation scheme, but it seems to work fairly well if the mapping isn't too clumpy and the original hasn't been downsampled a lot. Of course, if the downsample ratio is high, you're having to infer a lot of lost information, so it's inherently going to give rough results.
If you want to squeeze as much as possible out of the map inversion, you could try to solve the (potentially underdetermined) system of equations defined by the original interpolation scheme; not impossible, but challenging.
Upvotes: 0
Reputation: 7488
Many of the above solutions didn't work for me, failed when the map wasn't invertible, or weren't terribly fast.
I present an alternative, 6-line iterative solution.
def invert_map(F):
I = np.zeros_like(F)
    I[:,:,1], I[:,:,0] = np.indices(F.shape[:2])
P = np.copy(I)
for i in range(10):
P += I - cv.remap(F, P, None, interpolation=cv.INTER_LINEAR)
return P
How well does it do? For my use case of inverting a terrain correction map for aerial photography, this method converges comfortably in 10 steps to 1/10th of a pixel. It's also blazingly fast, because all the heavy compute is tucked inside OpenCV.
How does it work?
The approach uses the idea that if (x', y') = F(x, y) is a mapping, then the inverse can be approximated with (x, y) = -F(x', y'), as long as the gradient of F is small.
We can continue to refine our mapping; the above gets us our first prediction (I is an "identity mapping"):
G_1 = I - F
Our second prediction can be adapted from that:
G_2 = G_1 + I - F(G_1)
and so on:
G_n+1 = G_n + I - F(G_n)
Proving that G_n converges to the inverse F^-1 is hard, but what we can easily prove is that if G has converged, it will stay converged.
Assume G_n = F^-1. Then we can substitute into:
G_n+1 = G_n + I - F(G_n)
and then get:
G_n+1 = F^-1 + I - F(F^-1)
G_n+1 = F^-1 + I - I
G_n+1 = F^-1
Q.E.D.
Testing script
import cv2 as cv
from scipy import ndimage as ndi
import numpy as np
from matplotlib import pyplot as plt
# Simulate deformation field
N = 500
sh = (N, N)
t = np.random.normal(size=sh)
dx = ndi.gaussian_filter(t, 40, order=(0,1))
dy = ndi.gaussian_filter(t, 40, order=(1,0))
dx *= 10/dx.max()
dy *= 10/dy.max()
# Test image
img = np.zeros(sh)
img[::10, :] = 1
img[:, ::10] = 1
img = ndi.gaussian_filter(img, 0.5)
# Apply forward mapping
yy, xx = np.indices(sh)
xmap = (xx-dx).astype(np.float32)
ymap = (yy-dy).astype(np.float32)
warped = cv.remap(img, xmap, ymap, cv.INTER_LINEAR)
plt.imshow(warped, cmap='gray')
def invert_map(F: np.ndarray):
I = np.zeros_like(F)
    I[:,:,1], I[:,:,0] = np.indices(F.shape[:2])
P = np.copy(I)
for i in range(10):
P += I - cv.remap(F, P, None, interpolation=cv.INTER_LINEAR)
return P
# F: The function to invert
F = np.zeros((sh[0], sh[1], 2), dtype=np.float32)
F[:,:,0], F[:,:,1] = (xmap, ymap)
# Test the prediction
unwarped = cv.remap(warped, invert_map(F), None, cv.INTER_LINEAR)
plt.imshow(unwarped, cmap='gray')
Upvotes: 11
Reputation: 504
This is an important problem, and I am surprised that it is not better addressed in any standard library (at least to my knowledge).
I wasn't happy with the accepted solution, as it didn't use the implicit smoothness of the transformation. I might be missing important cases, but I cannot imagine mappings that are both invertible in any useful sense and non-smooth at the pixel scale.
Smoothness means that there is no need to compute nearest neighbors: the nearest points are those that are already near on the original grid.
My solution uses the fact that, in the original mapping, a square [(i,j), (i+1,j), (i+1,j+1), (i,j+1)] maps to a quadrilateral [(X[i,j], Y[i,j]), (X[i+1,j], Y[i+1,j]), ...] that has no other grid points inside. Then the inverse mapping only requires interpolation within the quadrilateral. For this I use inverse bilinear interpolation, which gives exact results at the vertices and for any affine transform.
The implementation has no dependency other than numpy. The logic is to run through all quadrilaterals and progressively build the reverse mapping. I copy the code here; hopefully there are enough comments to make the idea clear.
A few comments on the less obvious stuff:
- The main loop maps candidate grid points through bilinear_inverse, but at each iteration selects only the quadrilaterals for which the coordinates (offset to their bounding box) are valid.
import numpy as np
def bilinear_inverse(p, vertices, numiter=4):
"""
Compute the inverse of the bilinear map from the unit square
[(0,0), (1,0), (1,1), (0,1)]
    to the quadrilateral vertices = [p0, p1, p2, p3]
Parameters:
----------
p: array of shape (2, ...)
Points on which the inverse transforms are applied.
vertices: array of shape (4, 2, ...)
Coordinates of the vertices mapped to the unit square corners
numiter:
        Number of Newton iterations
Returns:
--------
s: array of shape (2, ...)
Mapped points.
This is a (more general) python implementation of the matlab implementation
suggested in https://stackoverflow.com/a/18332009/1560876
"""
p = np.asarray(p)
v = np.asarray(vertices)
sh = p.shape[1:]
if v.ndim == 2:
v = np.expand_dims(v, axis=tuple(range(2, 2 + len(sh))))
# Start in the center
s = .5 * np.ones((2,) + sh)
s0, s1 = s
for k in range(numiter):
# Residual
r = v[0] * (1 - s0) * (1 - s1) + v[1] * s0 * (1 - s1) + v[2] * s0 * s1 + v[3] * (1 - s0) * s1 - p
# Jacobian
J11 = -v[0, 0] * (1 - s1) + v[1, 0] * (1 - s1) + v[2, 0] * s1 - v[3, 0] * s1
J21 = -v[0, 1] * (1 - s1) + v[1, 1] * (1 - s1) + v[2, 1] * s1 - v[3, 1] * s1
J12 = -v[0, 0] * (1 - s0) - v[1, 0] * s0 + v[2, 0] * s0 + v[3, 0] * (1 - s0)
J22 = -v[0, 1] * (1 - s0) - v[1, 1] * s0 + v[2, 1] * s0 + v[3, 1] * (1 - s0)
inv_detJ = 1. / (J11 * J22 - J12 * J21)
s0 -= inv_detJ * (J22 * r[0] - J12 * r[1])
s1 -= inv_detJ * (-J21 * r[0] + J11 * r[1])
return s
def invert_map(xmap, ymap, diagnostics=False):
"""
Generate the inverse of deformation map defined by (xmap, ymap) using inverse bilinear interpolation.
"""
# Generate quadrilaterals from mapped grid points.
quads = np.array([[ymap[:-1, :-1], xmap[:-1, :-1]],
[ymap[1:, :-1], xmap[1:, :-1]],
[ymap[1:, 1:], xmap[1:, 1:]],
[ymap[:-1, 1:], xmap[:-1, 1:]]])
# Range of indices possibly within each quadrilateral
x0 = np.floor(quads[:, 1, ...].min(axis=0)).astype(int)
x1 = np.ceil(quads[:, 1, ...].max(axis=0)).astype(int)
y0 = np.floor(quads[:, 0, ...].min(axis=0)).astype(int)
y1 = np.ceil(quads[:, 0, ...].max(axis=0)).astype(int)
# Quad indices
i0, j0 = np.indices(x0.shape)
# Offset of destination map
x0_offset = x0.min()
y0_offset = y0.min()
# Index range in x and y (per quad)
xN = x1 - x0 + 1
yN = y1 - y0 + 1
# Shape of destination array
sh_dest = (1 + x1.max() - x0_offset, 1 + y1.max() - y0_offset)
# Coordinates of destination array
yy_dest, xx_dest = np.indices(sh_dest)
xmap1 = np.zeros(sh_dest)
ymap1 = np.zeros(sh_dest)
TN = np.zeros(sh_dest, dtype=int)
    # Small epsilon to avoid missing points lying on edges
epsilon = .01
# Loop through indices possibly within quads
for ix in range(xN.max()):
for iy in range(yN.max()):
# Work only with quads whose bounding box contain indices
valid = (xN > ix) * (yN > iy)
# Local points to check
p = np.array([y0[valid] + ix, x0[valid] + iy])
# Map the position of the point in the quad
s = bilinear_inverse(p, quads[:, :, valid])
# s out of unit square means p out of quad
# Keep some epsilon around to avoid missing edges
in_quad = np.all((s > -epsilon) * (s < (1 + epsilon)), axis=0)
# Add found indices
ii = p[0, in_quad] - y0_offset
jj = p[1, in_quad] - x0_offset
ymap1[ii, jj] += i0[valid][in_quad] + s[0][in_quad]
xmap1[ii, jj] += j0[valid][in_quad] + s[1][in_quad]
# Increment count
TN[ii, jj] += 1
ymap1 /= TN + (TN == 0)
xmap1 /= TN + (TN == 0)
if diagnostics:
diag = {'x_offset': x0_offset,
'y_offset': y0_offset,
'mask': TN > 0}
return xmap1, ymap1, diag
else:
return xmap1, ymap1
Here's a test example:
import cv2 as cv
from scipy import ndimage as ndi
from matplotlib import pyplot as plt
# Simulate deformation field
N = 500
sh = (N, N)
t = np.random.normal(size=sh)
dx = ndi.gaussian_filter(t, 40, order=(0,1))
dy = ndi.gaussian_filter(t, 40, order=(1,0))
dx *= 30/dx.max()
dy *= 30/dy.max()
# Test image
img = np.zeros(sh)
img[::10, :] = 1
img[:, ::10] = 1
img = ndi.gaussian_filter(img, 0.5)
# Apply forward mapping
yy, xx = np.indices(sh)
xmap = (xx-dx).astype(np.float32)
ymap = (yy-dy).astype(np.float32)
warped = cv.remap(img, xmap, ymap, cv.INTER_LINEAR)
plt.imshow(warped, cmap='gray')
# Now invert the mapping
xmap1, ymap1 = invert_map(xmap, ymap)
unwarped = cv.remap(warped, xmap1.astype(np.float32), ymap1.astype(np.float32), cv.INTER_LINEAR)
plt.imshow(unwarped, cmap='gray')
Upvotes: 7
Reputation: 341
Here's an implementation of @wcochran's answer. I was trying to invert a lens correction produced by lensfunpy.
import cv2
import numpy as np
import lensfunpy
from scipy.spatial import KDTree  # assumed; any KD-tree with a query(points, k) API works

# lens, cam, width, height, focal_length, aperture, distance come from your setup.
mod = lensfunpy.Modifier(lens, cam.crop_factor, width, height)
mod.initialize(focal_length, aperture, distance)
undist_coords = mod.apply_geometry_distortion()
## the lens correction part
# im_undistorted = cv2.remap(im, undist_coords, None, cv2.INTER_CUBIC)
# im_undistorted = cv2.remap(im, undist_coords, None, cv2.INTER_LANCZOS4)
# cv2.imwrite(undistorted_image_path, im_undistorted)
undist_coords_f = undist_coords.reshape((-1, 2))
tree = KDTree(undist_coords_f)
def calc_val(point_pos):
nearest_dist, nearest_ind = tree.query([point_pos], k=5)
if nearest_dist[0][0] == 0:
return undist_coords_f[nearest_ind[0][0]]
# starts inverse distance weighting
w = np.array([1.0 / pow(d, 2) for d in nearest_dist])
sw = np.sum(w)
x_arr = np.floor(nearest_ind[0] / 1080)
y_arr = (nearest_ind[0] % 1080)
xx = np.sum(w * x_arr) / sw
yy = np.sum(w * y_arr) / sw
return (xx, yy)
un_correction_x = np.zeros((720, 1080))
un_correction_y = np.zeros((720, 1080))
## reverse the lens correction
for i in range(720):
print("row %d operating" % i)
for j in range(1080):
un_correction_x[i][j], un_correction_y[i][j] = calc_val((i, j))
# print((i, j), calc_val((j, i)))
dstMap1, dstMap2 = cv2.convertMaps(un_correction_x.astype(np.float32), un_correction_y.astype(np.float32), cv2.CV_32FC2)
im_un_undistorted = cv2.remap(im_undistorted, dstMap1, dstMap2, cv2.INTER_LANCZOS4)
Upvotes: 2
Reputation: 41
You can invert the map at known points and interpolate it onto a new grid. This will work fine as long as the distortion is not very large.
Here is a very simple implementation in Python using scipy.interpolate.griddata:
map_x, map_y = cv2.initUndistortRectifyMap(K, D, None, new_K, image_size, cv2.CV_32FC1)
points = np.stack([map_x.flatten(), map_y.flatten()], axis=1)
grid = np.mgrid[:map_x.shape[0], :map_y.shape[1]]
values = grid.reshape(2, -1).T[..., ::-1]
from scipy.interpolate import griddata
grid_y, grid_x = grid
map_back = griddata(points, values, (grid_x, grid_y), method='cubic').astype(map_x.dtype)
If you use CV_32FC2 for the maps, you can simplify the points construction:
map_undistort, _ = cv2.initUndistortRectifyMap(K, D, None, new_K, image_size, cv2.CV_32FC2)
points = map_undistort.reshape(-1, 2)
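For a self-contained test without calibration data, the same griddata idea can be exercised on a synthetic map (values are illustrative):
import numpy as np
from scipy.interpolate import griddata

h, w = 200, 200
yy, xx = np.indices((h, w))
map_x = (xx + 3 * np.sin(yy / 15)).astype(np.float32)
map_y = (yy + 3 * np.cos(xx / 15)).astype(np.float32)
points = np.stack([map_x.ravel(), map_y.ravel()], axis=1)
values = np.stack([xx.ravel(), yy.ravel()], axis=1).astype(np.float32)
inv = griddata(points, values, (xx, yy), method='cubic')
inv_x = inv[..., 0].astype(np.float32)  # x part of the inverse map
inv_y = inv[..., 1].astype(np.float32)  # y part (NaN outside the convex hull)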
Upvotes: 4
Reputation: 229
If your map is derived from a homography H, you can invert H and directly create the inverse maps with cv::initUndistortRectifyMap().
e.g. in Python:
import numpy as np
map_size = () # fill in your map size
H_inv = np.linalg.inv(H)
map1, map2 = cv2.initUndistortRectifyMap(cameraMatrix=np.eye(3), distCoeffs=np.zeros(5), R=H_inv, newCameraMatrix=np.eye(3), size=map_size, m1type=cv2.CV_32FC1)
The OpenCV documentation states about initUndistortRectifyMap()
:
The function actually builds the maps for the inverse mapping algorithm that is used by
remap()
. That is, for each pixel (u, v) in the destination image, the function computes the corresponding coordinates in the source image.
In the case where you have just been given the maps, you have to do it yourself. However, interpolation of the new maps' coordinates is not trivial, because the support region for one pixel can be very large.
Here is a simple Python solution which inverts the maps by doing point-to-point mapping. This will probably leave some coordinates unassigned, while others will be updated several times, so there may be holes in the map.
Here is a small Python program demonstrating both approaches:
import cv2
import numpy as np
def invert_maps(map_x, map_y):
assert(map_x.shape == map_y.shape)
rows = map_x.shape[0]
cols = map_x.shape[1]
m_x = np.ones(map_x.shape, dtype=map_x.dtype) * -1
m_y = np.ones(map_y.shape, dtype=map_y.dtype) * -1
for i in range(rows):
for j in range(cols):
i_ = round(map_y[i, j])
j_ = round(map_x[i, j])
if 0 <= i_ < rows and 0 <= j_ < cols:
m_x[i_, j_] = j
m_y[i_, j_] = i
return m_x, m_y
def main():
img = cv2.imread("pigeon.png", cv2.IMREAD_GRAYSCALE)
    # a simple rotation by 45 degrees
H = np.array([np.sin(np.pi/4), -np.cos(np.pi/4), 0, np.cos(np.pi/4), np.sin(np.pi/4), 0, 0, 0, 1]).reshape((3,3))
H_inv = np.linalg.inv(H)
map_size = (img.shape[1], img.shape[0])
map1, map2 = cv2.initUndistortRectifyMap(cameraMatrix=np.eye(3), distCoeffs=np.zeros(5), R=H, newCameraMatrix=np.eye(3), size=map_size, m1type=cv2.CV_32FC1)
map1_inv, map2_inv = cv2.initUndistortRectifyMap(cameraMatrix=np.eye(3), distCoeffs=np.zeros(5), R=H_inv, newCameraMatrix=np.eye(3), size=map_size, m1type=cv2.CV_32FC1)
map1_simple_inv, map2_simple_inv = invert_maps(map1, map2)
img1 = cv2.remap(src=img, map1=map1, map2=map2, interpolation=cv2.INTER_LINEAR)
img2 = cv2.remap(src=img1, map1=map1_inv, map2=map2_inv, interpolation=cv2.INTER_LINEAR)
img3 = cv2.remap(src=img1, map1=map1_simple_inv, map2=map2_simple_inv,
interpolation=cv2.INTER_LINEAR)
cv2.imshow("Original image", img)
cv2.imshow("Mapped image", img1)
cv2.imshow("Mapping forth and back with H_inv", img2)
cv2.imshow("Mapping forth and back with invert_maps()", img3)
cv2.waitKey(0)
if __name__ == '__main__':
main()
Upvotes: 2
Reputation: 10896
Well, I just had to solve this remap inversion problem myself, and I'll outline my solution.
Given X, Y for the remap() function that does the following:
B[i, j] = A(X[i, j], Y[i, j])
I computed Xinv, Yinv that can be used by the remap() function to invert the process:
A[x, y] = B(Xinv[x, y], Yinv[x, y])
First I build a KD-Tree for the 2D point set {(X[i,j], Y[i,j])} so I can efficiently find the N nearest neighbors to a given point (x,y). I use Euclidean distance for my distance metric. I found a great C++ header lib for KD-Trees on GitHub.
Then I loop through all the (x,y) values in A's grid and find the N = 5 nearest neighbors {(X[i_k,j_k], Y[i_k,j_k]) | k = 0 .. N-1} in my point set.
If distance d_k == 0 for some k, then Xinv[x,y] = i_k and Yinv[x,y] = j_k; otherwise...
Use Inverse Distance Weighting (IDW) to compute an interpolated value:
w_k = 1 / pow(d_k, p) (I use p = 2)
Xinv[x,y] = (sum_k w_k * i_k) / (sum_k w_k)
Yinv[x,y] = (sum_k w_k * j_k) / (sum_k w_k)
Note that if B is a W x H image, then X and Y are W x H arrays of floats. If A is a w x h image, then Xinv and Yinv are w x h arrays of floats. It is important that you are consistent with image and map sizing.
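For reference, a compact SciPy sketch of this scheme (the original is C++; names here are mine, and Xinv/Yinv follow remap's convention that the first map holds column/x coordinates):
import numpy as np
from scipy.spatial import cKDTree

def invert_map_idw(X, Y, out_shape, k=5, p=2.0):
    h, w = out_shape
    # KD-tree over the forward-mapped point set {(X[i,j], Y[i,j])}.
    tree = cKDTree(np.column_stack([X.ravel(), Y.ravel()]))
    ys, xs = np.indices((h, w))
    d, idx = tree.query(np.column_stack([xs.ravel(), ys.ravel()]), k=k)
    i_k, j_k = idx // X.shape[1], idx % X.shape[1]  # grid coords of neighbors
    d = np.maximum(d, 1e-12)   # guard the exact-hit case d_k == 0
    wgt = 1.0 / d ** p         # IDW weights, p = 2 by default
    Xinv = (wgt * j_k).sum(axis=1) / wgt.sum(axis=1)
    Yinv = (wgt * i_k).sum(axis=1) / wgt.sum(axis=1)
    return (Xinv.reshape(h, w).astype(np.float32),
            Yinv.reshape(h, w).astype(np.float32))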
Works like a charm! My first version tried brute-forcing the search, and I never even waited for it to finish. I switched to a KD-Tree, and then I started to get reasonable run times. If I ever get time I would like to add this to OpenCV.
The second image below uses remap() to remove the lens distortion from the first image. The third image is the result of inverting the process.
Upvotes: 10
Reputation: 18874
OP here. I think I've found an answer. I haven't implemented it yet, and if someone comes up with a less fiddly solution (or finds something wrong with this one), I'll choose their answer instead.
Let A be the source image, B be the destination image, and M be the mapping from A's coords to B's coords, i.e.:
B[k, l, :] == A(M[k, l, 0], M[k, l, 1], :)
for all k, l in B's coords.
...where square braces indicate array lookup with integer indices, and circular braces indicate bilinear interpolation lookup with floating-point indices. We restate the above using the more economical notation:
B = A(M)
We wish to find an inverse mapping N that maps B back to A as best as is possible:
Find N s.t. A \approx B(N)
The problem can be stated without reference to A or B:
Find N = argmin_N || M(N) - I_n ||
...where ||*|| indicates the Frobenius norm, and I_n is the identity map with the same dimensions as N, i.e. a map where:
I_n[i, j, :] == [i, j]
for all i, j
If M's values are all integers, and M is an isomorphism, then you can construct N directly as:
N[M[k, l, 0], M[k, l, 1], :] = [k, l]
for all k, l
Or in our simplified notation:
N[M] = I_m
...where I_m is the identity map with the same dimensions as M.
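A sketch of that direct construction in NumPy, using a hypothetical integer-valued, bijective M (here a simple flip stands in for M):
import numpy as np

H, W = 4, 4
k, l = np.indices((H, W))
M = np.stack([(H - 1) - k, (W - 1) - l], axis=-1)  # stand-in bijective map
N = np.empty_like(M)
# N[M] = I_m: scatter each [k, l] to the slot M[k, l] points at.
N[M[..., 0], M[..., 1], 0] = k
N[M[..., 0], M[..., 1], 1] = l
# For a bijection, N also satisfies M[N] = I_n.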
There are two problems:
1. M's values are floats, not integers.
2. M is not guaranteed to be an isomorphism: some coordinates in A may have no preimage in M, while others may have several.
To handle these, proceed as follows.
Construct empty N as a 3D tensor of floats:
N = zeros(size=(A.shape[0], A.shape[1], 2))
For each coordinate [i, j] in A's coordinate space, do:
1. Search M for the 2x2 grid of A-coordinates whose quadrilateral encircles [i, j].
2. Interpolate between that quadrilateral's corners (e.g. with inverse bilinear interpolation) to get the B-coordinates [k, l] corresponding to [i, j], and store them in N[i, j, :].
The potentially expensive step here would be the search in step 1 for the 2x2 grid of A-coordinates in M that encircles [i, j]. A brute-force search would make this whole algorithm O(n*m) where n is the number of pixels in A, and m the number of pixels in B.
To reduce this to O(n), one could instead run a scanline algorithm within each A-coordinate quadrilateral to identify all the integer-valued coordinates [i, j] it contains. This could be precomputed as a hashmap that maps integer-valued A coords [i, j] to the upper-left corner of its encircling quadrilateral's B coords [k, l].
Upvotes: 0
Reputation: 6404
From what I understand, you have an original image and a transformed image, and you wish to recover the nature of the transform that has been applied without knowing it, but assuming it is something sensible, like a rotation or a fish-eye distortion.
What I would try is thresholding the images to convert them to binary, both the index image and the plain image. Then try to identify objects. Most mappings will at least retain connectivity and Euler number; mostly, the largest object in the index will still be the largest object in the plain image.
Then take moments of your matched image/index pairs and see if you can remove translation, rotation, and scaling. That gives you several reverse maps, which you can then try to stitch together. (This is hard if the transform is not simple, but the general problem of reconstituting an arbitrary transformation cannot be solved.)
Upvotes: -1
Reputation: 4874
There is no standard way to do this with OpenCV.
If you are looking for a complete ready-to-use solution, I am not sure that I can help, but I can at least describe a method that I used some years ago to do this task.
First of all, you should create remapping maps with the same dimensions as your source image. I created maps with larger dimensions for simpler interpolation, and at the final step cropped them to the proper size. Then you should fill them with the values existing in the previous remapping maps (not so difficult: just iterate over them, and if a map's coordinates x and y lie within the limits of your image, take their row and column as the new y and x, and place them into the old x and y column and row of the new map). It is a rather simple solution, but it gives a rather good result. For a perfect one, you should interpolate the old x and y to integer values using your interpolation method and neighbouring pixels.
After this you should either actually remap pixel colors manually, or completely fill your remapping map with pixel coordinates and use version from OpenCV.
You will meet a rather challenging task: you should interpolate pixels in empty areas. In other words, you should take the distances to the closest non-zero pixel coordinates and mix color (if you remap colors) or coordinate (if you proceed with full map computation) fractions according to these distances. Actually, it is also not so difficult for linear interpolation, and you can even look at the remap() implementation on OpenCV's GitHub page. For NN interpolation it will be much simpler: just take the color/coordinate of the nearest neighbour.
And a final task is the extrapolation of areas outside the borders of the remapped pixel area. The algorithm from OpenCV can also be used as a reference.
Upvotes: 0