Reputation: 3944
I'm implementing separable convolution to speed up 2D Gaussian convolution.
clear all;
close all;
im = randi([0,255],1024,1024);
win = 7;
window = fspecial('gaussian',win,win/6);
[U, S, V] = svd(window);
v = U(:,1) * sqrt(S(1,1));
h = V(:,1)' * sqrt(S(1,1));
out1 = filter2(window, im);
out2 = filter2(h, filter2(v, im));
norm(out1 - out2)
tic
for i = 1:1000
out1 = filter2(window, im);
end
toc
tic
for i = 1:1000
out2 = filter2(h, filter2(v, im));
end
toc
The separable version is supposed to be faster by win*win/(win + win) = 2.5 times, but it is actually slower:
ans =
2.6250e-12
Elapsed time is 5.486270 seconds.
Elapsed time is 8.769868 seconds.
Was there any hidden implementation inside filter2?
Upvotes: 0
Views: 286
Reputation: 5722
Matlab's implementation of the built-in conv
set of functions (including conv2
) is sub-optimal. You can read about it (and the workarounds) here: http://undocumentedmatlab.com/blog/convolution-performance.
The basic idea is to use either the convolution theorem or a C++-based MEX implementation, or preferably both, as used by Bruno Luong's convfft utility.
Upvotes: 0
Reputation: 4077
First of all, if you look at the source code of filter2
, you can see that it is already implemented using the same separable SVD approach. This is the gist of what filter2
does in your case (but you really should just step through the code):
[U, S, V] = svd(window,'econ');
v = U(:,1) * sqrt(S(1,1));
h = V(:,1)' * sqrt(S(1,1));
out2 = conv2(v, h, im, 'same');
What you do is the same, but instead of using a more efficient single call conv2(v, h, im, 'same')
, you end up doing recursive call conv2(conv2(im, v, 'same'), h, 'same')
, which is clearly what makes it slower.
Upvotes: 1