Abhinandan Dubey

Reputation: 675

Differences in filter2D implementation

I was trying to implement convolute2D (filter2D in OpenCV) and came up with the following code.

Mat convolute2D(Mat image, double** kernel, int W){
    Mat filtered_image = image.clone();
    // find center position of kernel (half of kernel size)
    int kCenterX = W / 2;
    int kCenterY = W / 2;
    int xx = 0;
    int yy = 0;
    cout << endl << "Performing convolution .." << endl;
    cout << "Image Size : " << image.rows << ", " << image.cols <<endl;
    for (int i = 0; i < image.rows; ++i){
        for (int j = 0; j < image.cols; ++j){
            for(int x = 0; x < W; ++x){
                xx = W - 1 - x;
                for(int y = 0; y < W; ++y){
                    yy = W - 1 - y;
                    int ii = i + (x - kCenterX);
                    int jj = j + (y - kCenterY);
                    if( ii >= 0 && ii < image.rows && jj >= 0 && jj < image.cols) {
                        filtered_image.at<uchar>(Point(j, i)) += image.at<uchar>(Point(jj, ii)) * kernel[xx][yy];
                    }

                }
            }
        }
    }
    return filtered_image;
}

This assumes the kernel is always square. But my results have been quite different from filter2D's. Is it because of possible overflow, or is there a problem with my implementation?

Thanks

Upvotes: 0

Views: 337

Answers (2)

Nuzhny

Reputation: 1927

Also, use cv::saturate_cast for a correct assignment to uchar, for example:

filtered_image.at<uchar>(Point(j, i)) = cv::saturate_cast<uchar>(res);
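For reference, cv::saturate_cast<uchar> rounds to the nearest integer and clamps the result into [0, 255], instead of letting the value wrap around as a plain cast would. A minimal standalone sketch of that behavior (a hypothetical stand-in, not OpenCV's actual implementation) might look like:

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical stand-in for cv::saturate_cast<uchar>: round to nearest,
// then clamp into the valid uchar range [0, 255].
unsigned char saturate_to_uchar(double v) {
    return static_cast<unsigned char>(std::clamp(std::lround(v), 0L, 255L));
}
```

So an accumulated result of 300.0 becomes 255, and -5.0 becomes 0, rather than wrapping to some unrelated value.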

Upvotes: 1

Cris Luengo

Reputation: 60780

There are two issues with your code:

  1. You don't set the output image to zero before adding values to it. Consequently, you are computing "input + filtered input", rather than just "filtered input".

  2. Presuming that kernel has quite small values, "input pixel * kernel value" will likely yield a small number, whose fractional part is truncated each time it is accumulated into a uchar. Summing these truncated per-term values over the kernel, you end up with a result that is too low.
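To see the second issue concretely, here is a small sketch (assuming a 3×3 box kernel with weights 1/9 applied to a flat region of value 100) comparing truncation at every accumulation step, which mimics `uchar += double`, against accumulating in a double and converting once:

```cpp
#include <cassert>

// Sum 9 products of 100 * (1/9), converting to uchar at every addition,
// mimicking the `uchar += double` accumulation in the question's loop.
unsigned char sum_truncated() {
    unsigned char acc = 0;
    for (int n = 0; n < 9; ++n)
        acc = static_cast<unsigned char>(acc + 100 * (1.0 / 9.0)); // drops ~0.11 per step
    return acc;
}

// Accumulate in a double and convert once at the end (the corrected path).
unsigned char sum_double() {
    double acc = 0;
    for (int n = 0; n < 9; ++n)
        acc += 100 * (1.0 / 9.0);
    return static_cast<unsigned char>(acc + 0.5); // round once
}
```

Here sum_truncated() yields 99 while sum_double() yields 100 (a box filter over a flat region should be exact), so the per-step truncation alone darkens the output even before the first issue comes into play.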

I recommend that you do this:

double res = 0;
for(int x = 0; x < W; ++x){
   int xx = W - 1 - x;
   for(int y = 0; y < W; ++y){
      int yy = W - 1 - y;
      int ii = i + (x - kCenterX);
      int jj = j + (y - kCenterY);
      if( ii >= 0 && ii < image.rows && jj >= 0 && jj < image.cols) {
         res += image.at<uchar>(Point(jj, ii)) * kernel[xx][yy];
      }
   }
}
filtered_image.at<uchar>(Point(j, i)) = res;

This solves both issues at once. Also, this should be a bit faster because accessing the output image has a bit of overhead.

For much faster speeds, consider that the check for out-of-bounds reads (the if in the inner loop) slows down your code significantly, and is totally unnecessary for most pixels (as few pixels are near the image edge). Instead, you can split up your loops into the ranges [0, kCenterX), [kCenterX, image.rows - kCenterX), and [image.rows - kCenterX, image.rows), and similarly for the columns. The middle loop, which is typically by far the largest, will not need to check for out-of-bounds reads.
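The split described above might be sketched as follows. This is an illustrative standalone version on a plain row-major buffer standing in for cv::Mat (the signature and helper lambda are my own, not from the question), which splits only the columns per row and additionally checks bounds for rows near the top or bottom edge:

```cpp
#include <cassert>
#include <vector>

// Sketch of the three-range split, assuming a square W x W kernel with W odd.
// Interior pixels of interior rows skip the bounds check entirely.
std::vector<double> convolve2D(const std::vector<double>& img,
                               int rows, int cols,
                               const std::vector<double>& kernel, int W) {
    const int c = W / 2;  // kernel center
    std::vector<double> out(img.size(), 0.0);

    // Accumulate one output pixel; `checked` enables the bounds test.
    auto accumulate = [&](int i, int j, bool checked) {
        double res = 0;
        for (int x = 0; x < W; ++x) {
            for (int y = 0; y < W; ++y) {
                int ii = i + x - c, jj = j + y - c;
                if (checked && (ii < 0 || ii >= rows || jj < 0 || jj >= cols))
                    continue;  // out-of-bounds input treated as zero
                // Flipped kernel indices, matching the xx/yy flip in the question.
                res += img[ii * cols + jj] * kernel[(W - 1 - x) * W + (W - 1 - y)];
            }
        }
        out[i * cols + j] = res;
    };

    for (int i = 0; i < rows; ++i) {
        bool edge_row = (i < c || i >= rows - c);
        for (int j = 0; j < c; ++j)           accumulate(i, j, true);      // left border
        for (int j = c; j < cols - c; ++j)    accumulate(i, j, edge_row);  // middle
        for (int j = cols - c; j < cols; ++j) accumulate(i, j, true);      // right border
    }
    return out;
}
```

With an identity kernel (all zeros except a 1 at the center), this returns the input unchanged, which is a quick sanity check that the index flipping and the checked/unchecked paths agree.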

Upvotes: 4
