Reputation:
I'm trying to vectorize or make this loop run faster (it's a minimal code):
n=1000;
L=100;
x=linspace(-L/2,L/2);
V1=rand(n);
for i=1:length(x)
for k=1:n
for j=1:n
V2(j,k)=V1(j,k)*log(2/L)*tan(pi/L*(x(i)+L/2)*j);
end
end
V3(i,:)=sum(V2);
end
would appreciate you help.
Upvotes: 1
Views: 219
Reputation: 26069
here's a vectorized solution using meshgrid
, bsxfun
and repmat
:
% fast preallocation
jj(n,n)=0; B(n,n,L)=0; V3(L,n)=0;
lg=log(2/L);
% the vectorizaion part
jj=meshgrid(1:n);
B=bsxfun(@times,ones(n),permute(x,[3 1 2]));
V3=squeeze(sum(lg*repmat(V1,1,1,numel(x)).*tan(bsxfun(@times,jj',pi/L*(B+L/2))),1)).';
Running your code at my computer using tic\toc took ~25 seconds. The bsxfun code took ~4.5 seconds...
Upvotes: 0
Reputation: 1603
An alternative to vectorization, is to recognize the expensive operations in the code and somehow reduce them. For instance, the log(2/L)
is called 100*1000*1000 times with input that does not depend on any of the three for
loops. If we calculate this value outside of the for
loops, then we can use it instead:
logResult = log(2/L);
and
V2(j,k)=V1(j,k)*log(2/L)*tan(pi/L*(x(i)+L/2)*j);
becomes
V2(j,k)=(V1(j,k)*logResult*tan(pi/L*(x(i)+L/2)*j));
Likewise, the code calls the tan
function the same 100*1000*1000 times. Note how this calculation, tan(pi/L*(x(i)+L/2)*j)
does not depend on k
. And so if we calculate these values outside of the for
loops, we can reduce this calculation by 1000 times:
tanValues = zeros(lenx,n);
for i=1:lenx
for j=1:n
tanValues(i,j) = tan(pi/L*(x(i)+L/2)*j);
end
end
and the calculation for V2(j,k)
becomes
V2(j,k)=V1(j,k)*logResult*tanValues(i,j);
Also, memory can be pre-allocated to the V2
and V3
matrices to avoid the internal resizing that occurs on each iteration. Just do the following outside the for
loops
V2 = zeros(n,n);
V3 = zeros(lenx,n);
Using tic
and toc
reduces the original execution from ~14 seconds to ~6 on my workstation. This is still three times slower than natan's solution which is ~2 seconds for me.
Upvotes: 2