Clex

Reputation: 63

Image Alignment of Multispectral Images Fails with ECC

I am trying to align an RGB image with an IR image (single channel) in order to create a four-channel R,G,B,IR image. For this I am using cv2.findTransformECC as described in this very neat guide. The code is unchanged for now, except for line 13, where the motion model is set to Euclidean because I want to handle rotations in the future. I am using Python.

In order to verify the workings of the software, I used the images from the guide. It worked well so I wanted to correlate satellite images from multiple spectra as described above. Unfortunately, I ran into problems here.

Sometimes the algorithm converged (after ages), sometimes it crashed immediately because it can't converge, and other times it "found" a solution that is clearly wrong. Attached are two images that, from a human perspective, are easy to match, yet the algorithm fails. The images are not rotated in any way; they are just not exactly the same crop (check the borders), so a purely translational motion is expected. The images show Lake Neusiedlersee in Austria; the source is Sentinelhub.

Edit: With "sometimes" I refer to using different images from Sentinel. A given pair of images consistently produces the same outcome.

"Neusieldersee" in NIR, Data from Sentinel "Neusieldersee" in RGB, Data from Sentinel

I know that ECC is not feature-based which might pose a problem here.

I have also read that it is somewhat dependent on the initial warp matrix.

My questions are:

  1. Am I using cv2.findTransformECC wrong?
  2. Is there a better way to do this?
  3. Should I try to "Monte-Carlo" the initial matrices until it converges? (This feels wrong)
  4. Do you suggest using a feature-based algorithm?
  5. If so, is there one available or would I have to implement this myself?

Thanks for the help!

Upvotes: 3

Views: 1669

Answers (1)

Burak

Reputation: 2495

Do you suggest using a feature-based algorithm?

Sure. There are many feature detection algorithms. I generally choose SIFT because it provides good matching results and its runtime is reasonably fast.

import cv2 as cv
import numpy as np

# read the images
ir = cv.imread('ir.jpg', cv.IMREAD_GRAYSCALE)
rgb = cv.imread('rgb.jpg', cv.IMREAD_COLOR)

descriptor = cv.SIFT_create()
matcher = cv.FlannBasedMatcher()

# get features from images
kps_ir, desc_ir = descriptor.detectAndCompute(ir, mask=None)
gray = cv.cvtColor(rgb, cv.COLOR_BGR2GRAY)
kps_color, desc_color = descriptor.detectAndCompute(gray, mask=None)

# find the corresponding point pairs
if desc_ir is None or desc_color is None or len(desc_ir) < 2 or len(desc_color) < 2:
    raise RuntimeError('not enough features detected to match')
rawMatch = matcher.knnMatch(desc_color, desc_ir, k=2)
matches = []
# ensure the distance is within a certain ratio of each other (i.e. Lowe's ratio test)
ratio = 0.75
for m in rawMatch:
    if len(m) == 2 and m[0].distance < m[1].distance * ratio:
        matches.append((m[0].trainIdx, m[0].queryIdx))

# convert keypoints to points
pts_ir, pts_color = [], []
for id_ir, id_color in matches:
    pts_ir.append(kps_ir[id_ir].pt)
    pts_color.append(kps_color[id_color].pt)
pts_ir = np.array(pts_ir, dtype=np.float32)
pts_color = np.array(pts_color, dtype=np.float32)

# compute homography
if len(matches) < 4:
    raise RuntimeError('not enough matches to estimate a homography')
H, status = cv.findHomography(pts_ir, pts_color, cv.RANSAC)

warped = cv.warpPerspective(ir, H, (rgb.shape[1], rgb.shape[0]))
warped = cv.cvtColor(warped, cv.COLOR_GRAY2BGR)

# visualize the result
winname = 'result'
cv.namedWindow(winname, cv.WINDOW_KEEPRATIO)
alpha = 5
res = None
def onChange(alpha):
    global res  # only res is reassigned; the other names are read-only
    res = cv.addWeighted(rgb, alpha/10, warped, 1 - alpha/10, 0)
    cv.imshow(winname, res)
onChange(alpha)
cv.createTrackbar('alpha', winname, alpha, 10, onChange)
cv.imshow(winname, res)
cv.waitKey()
cv.destroyWindow(winname)

Result (alpha=8)


Edit: It seems like SIFT is not the best option as it fails for some other examples. Example images are in another question.

In this case, I suggest using SURF. It is a patented algorithm, so it does not ship with recent OpenCV pip packages. You can install an older version of OpenCV (with the contrib modules) or build it from source.

descriptor = cv.xfeatures2d.SURF_create()

Result (alpha=8)

SURF homography result

Edit 2: It is now clear that the key to this task is choosing the correct feature descriptor. As a final note, I suggest also choosing the appropriate motion model. An affine transform fits better than a homography in this case.

H, _ = cv.estimateAffine2D(pts_ir, pts_color)
H = np.vstack((H, [0, 0, 1]))

Affine transform result:

SURF affine result

Upvotes: 1
