Reputation: 173
As you know, the function Ptr<Filter> cv::cuda::createMedianFilter(int srcType, int windowSize, int partition=128)
was added in OpenCV 3.1.0.
I'm trying to apply a median filter to large 8-bit images (6000*6000) with a custom window size (up to 21). I compared cv::medianBlur
and cv::cuda::createMedianFilter
and the results were:
windowSize   cv::medianBlur   cv::cuda::createMedianFilter
    3           0.071 sec         3.637 sec
    5           0.285 sec         3.679 sec
   11           2.641 sec         3.652 sec
   19           2.566 sec         3.719 sec
1) Why is cuda::createMedianFilter slower than cv::medianBlur?
2) How can I write kernel code to implement a median filter that works on an OpenCV Mat with a custom kernel size?
Upvotes: 1
Views: 2659
Reputation: 11
I also used cuda::createMedianFilter()
and found that two GpuMats are newly allocated in MedianFilter::apply()
every time filter->apply() is called.
GPU memory allocation is very time-consuming, so I moved the two mats into the MedianFilter class as member variables (so they are not reallocated unless the image size changes).
This gave a 4x speedup, tested with 1000 images (400 * 300). Also, it seems the parameter partitions can be set to src.rows / 2, which is faster than the default value of 128.
The two mats in the source code are GpuMat devHist; GpuMat devCoarseHist;
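For reference, a minimal sketch of that buffer-reuse idea (the member names devHist/devCoarseHist come from the answer above; the class name, histogram sizes, and surrounding structure are my assumptions, not the actual OpenCV source):

```cpp
#include <opencv2/core/cuda.hpp>

// Scratch buffers held as members so they survive between apply() calls.
class ReusableMedianFilter
{
public:
    ReusableMedianFilter(int windowSize, int partitions)
        : windowSize_(windowSize), partitions_(partitions) {}

    void apply(const cv::cuda::GpuMat& src, cv::cuda::GpuMat& dst)
    {
        dst.create(src.size(), src.type());

        // GpuMat::create is a no-op when size/type already match, so these
        // device allocations only happen when the image size changes.
        devHist_.create(1, src.cols * histSize_ * partitions_, CV_32SC1);
        devCoarseHist_.create(1, src.cols * coarseHistSize_ * partitions_, CV_32SC1);
        devHist_.setTo(cv::Scalar::all(0));
        devCoarseHist_.setTo(cv::Scalar::all(0));

        // ... launch the histogram-based median kernel here, passing the
        //     cached devHist_/devCoarseHist_ buffers ...
    }

private:
    int windowSize_, partitions_;
    int histSize_ = 256;      // 256 bins for 8-bit input (assumption)
    int coarseHistSize_ = 8;  // coarse histogram level (assumption)
    cv::cuda::GpuMat devHist_, devCoarseHist_;
};
```

On the calling side, the partitions value mentioned above is just the third argument of the factory, e.g. cv::cuda::createMedianFilter(CV_8UC1, windowSize, src.rows / 2), and the returned filter should be created once and reused across frames.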
Upvotes: 1
Reputation: 9779
The speed of a convolution operation mainly depends on the size of the filter kernel when the image size is constant. Since sorting is more complicated than summation, a median filter takes longer than a simple convolution.
To go down to a low level and implement your own CUDA filter function with a customized kernel size, you need to get the raw pointer to your image data, e.g.
MyConv(char* image, int width, int height, int stride)
and then write the CUDA code.
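For example, here is a minimal sketch of such a kernel for a single-channel 8-bit cv::Mat (the names myMedianFilter/medianKernel and the simple per-pixel insertion sort are mine, chosen for clarity rather than speed):

```cpp
#include <opencv2/core.hpp>
#include <cuda_runtime.h>

__global__ void medianKernel(const unsigned char* src, unsigned char* dst,
                             int width, int height, int windowSize)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    const int radius = windowSize / 2;
    unsigned char window[441];          // enough for windowSize up to 21 (21*21)
    int count = 0;

    // Gather the neighbourhood, replicating pixels at the image borders.
    for (int dy = -radius; dy <= radius; ++dy)
        for (int dx = -radius; dx <= radius; ++dx)
        {
            int sx = min(max(x + dx, 0), width  - 1);
            int sy = min(max(y + dy, 0), height - 1);
            window[count++] = src[sy * width + sx];
        }

    // Insertion sort: fine for small windows, too slow for large ones.
    for (int i = 1; i < count; ++i)
    {
        unsigned char v = window[i];
        int j = i - 1;
        while (j >= 0 && window[j] > v) { window[j + 1] = window[j]; --j; }
        window[j + 1] = v;
    }

    dst[y * width + x] = window[count / 2];
}

// Host wrapper: copies a continuous CV_8UC1 cv::Mat to the device,
// runs the kernel, and copies the result back.
void myMedianFilter(const cv::Mat& src, cv::Mat& dst, int windowSize)
{
    CV_Assert(src.type() == CV_8UC1 && src.isContinuous() && windowSize <= 21);
    dst.create(src.size(), CV_8UC1);
    const size_t bytes = (size_t)src.rows * src.cols;

    unsigned char *dSrc = nullptr, *dDst = nullptr;
    cudaMalloc(&dSrc, bytes);
    cudaMalloc(&dDst, bytes);
    cudaMemcpy(dSrc, src.data, bytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((src.cols + block.x - 1) / block.x, (src.rows + block.y - 1) / block.y);
    medianKernel<<<grid, block>>>(dSrc, dDst, src.cols, src.rows, windowSize);
    cudaDeviceSynchronize();

    cudaMemcpy(dst.data, dDst, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dSrc);
    cudaFree(dDst);
}
```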
Here's a tutorial on CUDA convolution:
http://igm.univ-mlv.fr/~biri/Enseignement/MII2/Donnees/convolutionSeparable.pdf
This question also gives an example.
Upvotes: 1