yuuki76

Reputation: 13

How can I get exactly the same results in OpenCV as with ImageMagick's "convert -fft"?

Update: when I used ImageMagick-Q16, the problem was solved; my original assumption was wrong.

To analyze images from anime shows, I would like to display the magnitude values obtained by the FFT. ImageMagick's convert INPUT -fft OUTPUT worked well, but it is unwieldy on Windows because it requires FFTW. Therefore, I would like to use OpenCV in Python to get exactly the same FFT as ImageMagick. Thank you very much for your help.

convert INPUT -set colorspace Gray -separate -average lenagray.png 
convert lenagray.png -fft lenagfft.png
convert lenagfft-0.png -contrast-stretch 0 -evaluate log 10000 RESULT

I would like to produce an image in OpenCV that looks identical to the output of this code; when I try it on Lena, it looks like this:

Upvotes: 0

Views: 692

Answers (2)

fmw42

Reputation: 53174

Here are the differences between ImageMagick's FFT (DFT) and OpenCV's DFT, and how to make them equivalent.

In ImageMagick's command:

convert lena.png -colorspace gray -fft -delete 1 -evaluate log 10000 lena_fft_spec10000.png

The log is base 10, and colorspace gray is

Gray = 0.298839*R+0.586811*G+0.114350*B

Furthermore, ImageMagick scales the image gray levels to the range 0 to 1 internally and then normalizes the Fourier transform in the forward direction (by dividing by the total number of pixels, N), so that the center pixel value of the dft is equivalent to the mean gray level of the (spatial domain) input image.
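Here is a minimal sketch of that property (lena.png is assumed to be the grayscale Lena used above): after scaling the gray levels to 0 to 1 and dividing by the pixel count before the forward DFT, the magnitude of the shifted center (DC) pixel should match the mean gray level of the input.

import numpy as np
import cv2

# lena.png assumed present; scale gray levels to the range 0 to 1
img = cv2.imread('lena.png', 0).astype(np.float32) / 255
N = img.shape[0] * img.shape[1]   # total number of pixels

# forward DFT normalized by 1/N, ImageMagick-style
dft = cv2.dft(img / N, flags=cv2.DFT_COMPLEX_OUTPUT)
dft_shift = np.fft.fftshift(dft)
mag, _ = cv2.cartToPolar(dft_shift[:, :, 0], dft_shift[:, :, 1])

# center (DC) magnitude vs mean gray level -- the two should agree
cy, cx = img.shape[0] // 2, img.shape[1] // 2
print(mag[cy, cx])
print(img.mean())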

Using the above, I get:

[ImageMagick FFT spectrum result]

In OpenCV/NumPy, np.log is the natural log. So one needs to use np.log10 and build the equivalent normalized log function for the spectrum, which is

spectrum = np.log10(10000*mag+1)/np.log10(10000+1)
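For reference, this mapping can be wrapped in a small helper (im_log is just a hypothetical name) to see how it compresses the range: it sends 0 to 0 and 1 to 1, mirroring ImageMagick's -evaluate log 10000.

import numpy as np

def im_log(mag, scale=10000):
    # normalized base-10 log, mirroring ImageMagick's "-evaluate log <scale>":
    # maps 0 -> 0 and 1 -> 1 and boosts small magnitudes in between
    return np.log10(scale * mag + 1) / np.log10(scale + 1)

print(im_log(np.array([0.0, 0.001, 1.0])))   # roughly [0.0, 0.26, 1.0]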

OpenCV's formula for converting to grayscale is slightly different:

RGB[A] to Gray: Y ← 0.299⋅R + 0.587⋅G + 0.114⋅B

which likely differs only in rounding precision.
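As a quick sanity check (a sketch assuming an RGB lena.png), one can compare OpenCV's built-in conversion against the ImageMagick weights quoted above; the largest per-pixel difference should be well under one gray level.

import numpy as np
import cv2

bgr = cv2.imread('lena.png')                      # OpenCV loads images as BGR
gray_cv = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)   # 0.299 R + 0.587 G + 0.114 B

b, g, r = cv2.split(bgr.astype(np.float32))
gray_im = 0.298839 * r + 0.586811 * g + 0.114350 * b   # ImageMagick weights

# difference comes only from the slightly different weights plus rounding
print(np.abs(gray_cv.astype(np.float32) - gray_im).max())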

The main issue is that in OpenCV, the normalization by 1/N (N = total pixels) is done in the inverse transform. This means that the center pixel of the magnitude will be the sum of the gray levels of the (spatial domain) input image. To make them equivalent, we need to divide the input by 255 to get it into the range 0 to 1 and then divide by (256*256 = total number of pixels). So here is that modified code.

import numpy as np
import cv2

# read input as grayscale, convert to float and divide by 255
# opencv fft only works on grayscale
img = cv2.imread('lena.png', 0).astype(np.float32)/255

# do dft normalizing by 1/N (N=256*256), saving as complex output
dft = cv2.dft(img/(256*256), flags = cv2.DFT_COMPLEX_OUTPUT)

# apply shift of origin from upper left corner to center of image
dft_shift = np.fft.fftshift(dft)

# extract magnitude and phase images
mag, phase = cv2.cartToPolar(dft_shift[:,:,0], dft_shift[:,:,1])
    
# get spectrum for viewing using normalized base 10 log 
spec = np.log10(10000*mag+1)/np.log10(10000+1)

# convert from float in range 0-1 to uint8 in range 0-255
spec = (255*spec).clip(0,255).astype(np.uint8)

# write result to disk
cv2.imwrite("lena_dft_spectrum_opencv3.png", spec)
 
cv2.imshow("ORIGINAL", img)
cv2.imshow("MAG", mag)
cv2.imshow("PHASE", phase)
cv2.imshow("SPECTRUM", spec)
cv2.waitKey(0)
cv2.destroyAllWindows()

This results in:

[OpenCV DFT spectrum result]

Visually, they look identical. If I use ImageMagick compare to see how close they are, I get:

compare -metric rmse lena_fft_spec10000.png lena_dft_spectrum_opencv2.png null:
386.182 (0.00589276)

This says that the rmse difference is 0.59%

Differences in floating point precision can account for such differences.
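If ImageMagick's compare is not at hand, a rough NumPy equivalent of the RMSE check looks like this (the second filename assumes the output written by the script above; note that ImageMagick reports the RMSE on its Q16 0-65535 scale, so only the fractional value is directly comparable):

import numpy as np
import cv2

a = cv2.imread('lena_fft_spec10000.png', 0).astype(np.float32)
b = cv2.imread('lena_dft_spectrum_opencv3.png', 0).astype(np.float32)

# root-mean-square error, absolute (0-255 scale) and as a fraction of full scale
rmse = np.sqrt(np.mean((a - b) ** 2))
print(rmse, rmse / 255)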

Upvotes: 0

fmw42
fmw42

Reputation: 53174

Here is how to get the spectrum (log of the magnitude of the DFT) of an image in Python/OpenCV.

Input:

[Lena input image]

import numpy as np
import cv2

# read input as grayscale
# opencv dft only works on grayscale
img = cv2.imread('lena.png', 0)

# convert image to floats and do dft saving as complex output
dft = cv2.dft(np.float32(img), flags = cv2.DFT_COMPLEX_OUTPUT)

# apply shift of origin from upper left corner to center of image
dft_shift = np.fft.fftshift(dft)

# extract magnitude and phase images
mag, phase = cv2.cartToPolar(dft_shift[:,:,0], dft_shift[:,:,1])

# get spectrum for viewing only
spec = np.log(mag) / 30

# convert from float in range 0-1 to uint8 in range 0-255
spec = (255*spec).clip(0,255).astype(np.uint8)

# write result to disk
cv2.imwrite("lena_dft_spectrum_opencv2.png", spec)
 
cv2.imshow("ORIGINAL", img)
cv2.imshow("MAG", mag)
cv2.imshow("PHASE", phase)
cv2.imshow("SPECTRUM", spec)
cv2.waitKey(0)
cv2.destroyAllWindows()

Resulting Spectrum Image:

[OpenCV spectrum result]

Here is the spectrum from ImageMagick. Note that its log function is a normalized log, so it will not produce exactly the same result as above; the only difference is in the scaling from the log functions.

convert lena.png -colorspace gray -fft -delete 1 -evaluate log 20000 lena_fft.png

[ImageMagick spectrum with -evaluate log 20000]

Or one can get a closer comparison result as:

convert lena.png -colorspace gray -fft -delete 1 -evaluate log 20000000 -evaluate divide 2 lena_fft2.png

[ImageMagick spectrum with -evaluate log 20000000 -evaluate divide 2]
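To see the scaling difference in isolation, here is a toy sketch; the sample magnitudes are made up, and ImageMagick applies its log to magnitudes already normalized to 0 to 1, so this only compares the shape of the two curves, not a pixel-for-pixel result.

import numpy as np

# made-up magnitude samples in the 0..1 range ImageMagick works in
mag = np.array([1e-4, 1e-2, 0.1, 1.0])

plain_log = np.log(mag) / 30                                 # plain natural log / 30, as in the code above
norm_log = np.log10(20000 * mag + 1) / np.log10(20000 + 1)   # normalized log, as in "-evaluate log 20000"

print(plain_log)   # negative for small magnitudes, clipped to 0 when displayed
print(norm_log)    # stays in 0..1 and boosts small magnitudes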

Upvotes: 1
