kirbuchi

Reputation: 2304

Fast 2D convolution implementation?

I've made a CUDA program for 2D convolution and now want to compare it to some non-CUDA implementation to measure the speedup.

I could compare it to my own implementation in plain C using the classical nested-loop approach, or to MATLAB's conv2, but neither feels like a fair comparison, since they're not the fastest implementations out there.

I was also thinking of trying OpenCV, and I've been looking for a SIMD-optimized version with no luck. Any advice? Should I go with OpenCV?

NOTE: I've read other questions, including this one, but the answer is basically the same as my plain C code or a discussion of the various methods available.

Upvotes: 3

Views: 5943

Answers (1)

Pace

Reputation: 43937

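For large kernels, the fastest general 2D convolution approach is FFT-based: take the FFT of both the zero-padded source and the zero-padded kernel, multiply the two spectra pointwise, then take the inverse FFT to get the result (this may be what conv2 does in MATLAB, though that is worth checking). So your nested-loop approach probably isn't the best baseline once the kernel gets large. For small kernels, a direct spatial implementation is often just as fast or faster.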
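To make the FFT route concrete, here is the convolution theorem it relies on, written as a sketch. The symbols are my own notation, not from the question: f is the N x N image, g is the K x K kernel, and \mathcal{F} is the 2D discrete Fourier transform.

```latex
f * g = \mathcal{F}^{-1}\bigl\{\, \mathcal{F}\{f\} \cdot \mathcal{F}\{g\} \,\bigr\}
% Pad f and g with zeros to at least P x P, where P = N + K - 1,
% so the circular convolution computed by the DFT equals the linear one.
% Direct cost:  O(N^2 K^2)
% FFT cost:     O(P^2 \log P)
```

The product on the right is elementwise. The cost comparison is why the FFT approach tends to pay off only once K is reasonably large.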

GSL (the GNU Scientific Library) will give you a standard, fast FFT implementation if you want to go that route.

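Also, if the kernel is separable (that is, it can be written as the outer product of a column vector and a row vector), you can do the convolution as two 1D convolutions, one along the rows and one along the columns.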
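Here is a minimal sketch of the two-pass idea in plain C, in case it helps as a CPU baseline. The function name convolve_separable, the row-major float layout, the zero padding at the borders, and the odd kernel lengths are all my assumptions, not something from the question. It is not SIMD-optimized.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: "same"-size 2D convolution with a separable kernel.
 * src, dst: w x h images, row-major. kx: horizontal taps (length nx),
 * ky: vertical taps (length ny). Both lengths are assumed odd.
 * Pixels outside the image are treated as zero.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int convolve_separable(const float *src, float *dst, int w, int h,
                       const float *kx, int nx, const float *ky, int ny)
{
    assert(src && dst && kx && ky);
    assert(w > 0 && h > 0);
    assert(nx > 0 && ny > 0 && nx % 2 == 1 && ny % 2 == 1);

    float *tmp = malloc((size_t)w * (size_t)h * sizeof *tmp);
    if (!tmp)
        return -1;

    const int rx = nx / 2; /* horizontal kernel radius */
    const int ry = ny / 2; /* vertical kernel radius */

    /* Pass 1: convolve every row with kx, writing into tmp. */
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float acc = 0.0f;
            for (int i = 0; i < nx; ++i) {
                int sx = x + rx - i; /* flipped index: true convolution */
                if (sx >= 0 && sx < w)
                    acc += kx[i] * src[y * w + sx];
            }
            tmp[y * w + x] = acc;
        }
    }

    /* Pass 2: convolve every column of tmp with ky, writing into dst. */
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float acc = 0.0f;
            for (int i = 0; i < ny; ++i) {
                int sy = y + ry - i;
                if (sy >= 0 && sy < h)
                    acc += ky[i] * tmp[sy * w + x];
            }
            dst[y * w + x] = acc;
        }
    }

    free(tmp);
    return 0;
}
```

Per output pixel this costs nx + ny multiply-adds instead of nx * ny for the full 2D kernel, which is where the saving comes from. Feeding it a single-pixel impulse should give back the outer product of ky and kx, which is a quick way to sanity-check it against your CUDA output.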

OpenCV is a good option too, if it works for you; it is widely accepted as a fast implementation.

Upvotes: 5
