Reputation: 13
I have two signals x1 and x2. I'm trying to do convolution once using CONV(x1,X2) directly and once using fft and ifft, and compare execution time of both operations.
I don't know why the execution time of fft is not faster than using conv.
n = 0: 2^15 ;
x1(n+1) = (0.25).^n ;
x2(n+1) = 1;
tic
Time_Convolution = conv(x1,x2);
toc
Padding = (length(x1)+ length(x2) - 1)-length(x1) ;
x1_before_fft = [ x1 zeros(1, Padding) ];
x2_before_fft = [ x2 zeros(1, Padding)];
tic
Convolution = ifft(fft( x1_before_fft).*fft(x2_before_fft));
toc
Here is the output
Elapsed time is 0.010414 seconds. Elapsed time is 0.017308 seconds.
Upvotes: 0
Views: 235
Reputation: 781
The key here is that one of your input signals contains mostly zeros, as you're losing precision to roundoff from raising 0.25 to large powers:
>> nnz(x1) / numel(x1)
ans =
0.0164
Less than 2% of your first input is non-zero. A direct implementation of convolution can exploit this to get rid of a lot of operations that contribute nothing to the result. An FFT-based convolution, however, will always do just about the same amount of work regardless of the magnitude of the coefficients involved.
The story is quite different when you make those coefficients non-zero. On my machine, I see:
>> tic; for k = 1:100, conv(x1,x2); end, toc
Elapsed time is 0.291373 seconds.
>> x1 = x1 + eps;
>> tic; for k = 1:100, conv(x1,x2); end, toc
Elapsed time is 3.937819 seconds.
Another thing to note is that you are using an unfortunate choice of FFT size to do the convolution via FFT. Any transform that uses at least length(x1)+length(x2)-1
points will do the trick (you just have to trim off any extra coefficients at the end if you have a larger transform), so it's best to pick one that has small prime factors. In this case, however, length(x1)+length(x2)-1
is itself prime, so it's the worst possible choice. Look at the difference you can see by just increasing that length by 1:
>> N = length(x1)+length(x2)-1; % Original size.
>> tic; for k = 1:100, ifft(fft( x1, N).*fft(x2,N)); end, toc
Elapsed time is 1.036913 seconds.
>> N = length(x1)+length(x2); % Better size.
>> tic; for k = 1:100, ifft(fft( x1, N).*fft(x2,N)); end, toc
Elapsed time is 0.289473 seconds.
Of course, you can do even better if you keep reducing the factors of N:
>> N = length(x1)+length(x2)
N =
65538
>> while max(factor(N)) > 7, N = N + 2; end
>> N
N =
65610
>> tic; for k = 1:100, ifft(fft( x1, N).*fft(x2,N)); end, toc
Elapsed time is 0.250967 seconds.
So just by picking a better transform size, you get a 4x speedup, and now even with the ability of conv
to optimize for all of those zeros, you'll get better speed.
Upvotes: 1