Mickey Shine
Mickey Shine

Reputation: 12559

How to transform filter when using FFT to do 2d convolution?

I want to use FFT to accelerate 2D convolution. The filter is 15 x 15 and the image is 300 x 300. The filter's size is different with image so I can not doing dot product after FFT. So how to transform the filter before doing FFT so that its size can be matched with image?

Upvotes: 0

Views: 2394

Answers (2)

FiReTiTi
FiReTiTi

Reputation: 5898

it depends on the algorithm you use for the FFT, because most of them need to work with images of dyadic dimensions (power of 2).

Here is what you have to do:

  1. Padding image: center your image into a bigger one with dyadic dimensions
  2. Padding kernel: center you convolution kernel into an image with same dimensions as step 1.
  3. FFT on the image from step 1
  4. FFT on the kernel from step 2
  5. Complex multiplication (Fourier space) of results from steps 3 and 4.
  6. Inverse FFT on the resulting image on step 5
  7. Unpadding on the resulting image from step 6
  8. Put all 4 blocs into the right order.

If the algorithm you use does not need dyadic dimensions, then steps 1 is useless and 2 has to be a simple padding with the image dimensions.

Upvotes: 3

Jiby
Jiby

Reputation: 1885

I use the convention that N is kernel size.

Knowing the convolution is not defined (mathematically) on the edges (N//2 at each end of each dimension), you would loose N pixels in totals on each axis.

You need to make room for convolution : pad the image with enough "neutral values" so that the edge cases (junk values inserted there) disappear.

This would involve making your image a 307x307px image (with suitable padding values, see next paragraph), which after convolution gives back a 300x300 image.

Popular image processing libraries have this already embedded : when you ask for a convolution, you have extra arguments specifying the "mode".


Which values can we pad with ?

Stolen with no shame from Numpy's pad documentation

  • 'constant' : Pads with a constant value.
  • 'edge' : Pads with the edge values of array.
  • 'linear_ramp' : Pads with the linear ramp between end_value and the arraydge value.
  • 'maximum' : Pads with the maximum value of all or part of the vector along each axis.
  • 'mean' Pads with the mean value of all or part of the vector along each axis.
  • 'median' Pads with the median value of all or part of the vector along each axis.
  • 'minimum' Pads with the minimum value of all or part of the vector along each axis.
  • 'reflect' Pads with the reflection of the vector mirrored on the first and last values of the vector along each axis.
  • 'symmetric' Pads with the reflection of the vector mirrored along the edge of the array.
  • 'wrap' Pads with the wrap of the vector along the axis. The first values are used to pad the end and the end values are used to pad the beginning.

It's up to you, really, but the rule of thumb is "choose neutral values for the task at hand".

(For instance, padding with 0 when doing averaging makes little sense, because 0 is not neutral in an average of positive values)

Upvotes: 2

Related Questions