Edmond Muho
Edmond Muho

Reputation: 19

How to improve the speed of this code with double loops in Matlab? Is it possible?

Dear community of stack overflow. Is it possible to improve the speed of this code in Matlab? Can I use vectorization? Note that I have to use in every loop the "vpasolve" or "fsolve" function.

Here is the code which calls the function with a double loop:

   for i=1:1000
        for j=1:1000
            SilA(i,j)=SolW(i,j);  
        end
    end 

Here is the function which is called by the above code. Can I vectorize the input of w, xi and make the code run faster?

function [ffw] = SolW(w,xi)

format long e;
z=0;mm=0.46;sop=80;
epit=0.1;voP=80.;
rho=2.1;aposC=0.1;aposD=0.1;
parEh=0.2*10^6;parEv=0.2*10^6;parGv=0.074*10^6;
parpos=0.35;hp=0.2;Ep=30*10^6;
parposV=0.20;ll=0.15;dd=2*ll;

 C11=(parEh*(parEv-parEh*parpos^2)/((1+parpos)*(parEv-parEv*parpos-2*parEh*parpos^2)))*(1+2*1i*aposD);
C13=(parEh*parEv*parpos/((parEv-parEv*parpos-2*parEh*parpos^2)))*(1+2*1i*aposD);
C33=((1-parpos)*parEv^2/(parEv-parEv*parpos-2*parEh*parpos^2))*(1+2*1i*aposD);
C44=parGv*(1+2*1i*aposD);
DD=(Ep*hp^3)/(12*(1-parposV^2));
a1=C44;
a2=C33;
c1=(C13+C44)*1i*xi;
c2=(C13+C44)*1i*xi;
b1=rho*w^2-C11*xi^2;
b2=rho*w^2-C44*xi^2;

 syms xr
rsol=vpasolve((a1*xr+b1).*(a2*xr+b2)-c1*c2*xr==0,xr);
rsol=eval(rsol);
r=[-sqrt(rsol)];
r1=r(1,:);
r2=r(2,:);


 Fdf1=sop*sqrt(2*pi/(1i*epit*xi))*exp(1i*(xi*voP+w)^2/(2*epit*xi));
BC11=C44*(r1-1i*xi*c2*r1/(a2*r1^2+b2));
BC21=(C13*1i*xi)-((C33*c2*r1^2)/(a2*r1^2+b2))+(DD*xi^4-mm*w^2+1i*aposC*w)*c2*r1/(a2*r1^2+b2);
BC12=C44*(r2-1i*xi*c2*r2/(a2*r2^2+b2));
BC22=(C13*1i*xi)-((C33*c2*r2^2)/(a2*r2^2+b2))+(DD*xi^4-mm*w^2+1i*aposC*w)*c2*r2/(a2*r2^2+b2);

 syms As1 As2;
   try
[Ass1,Ass2]=vpasolve(BC11*As1+BC12*As2==0,BC21*As1+BC22*As2+Fdf1==0,As1,As2);
       A1=eval(Ass1);
       A2=eval(Ass2);
   catch
       A1=0.0;
       A2=0.0;
   end


Bn1=-(c2*r1/(a2*r1^2+b2))*A1;
Bn2=-(c2*r2/(a2*r2^2+b2))*A2;
ffw=Bn1*exp(r1*z)+Bn2*exp(r2*z);

end

Upvotes: 0

Views: 179

Answers (2)

Edmond Muho
Edmond Muho

Reputation: 19

In the above program everything can be vectorized as @beesleep said above. For example:

[I, J] = meshgrid(1:1000, 1:1000)
c1=(C13+C44)*1i.*xi;
c2=(C13+C44)*1i.*xi;
b1=rho.*w.^2 - C11.*xi.^2;
b2=rho.*w.^2-C44.*xi.^2;   

The vpasolve part ,i.e.,

 syms xr
 rsol=vpasolve((a1*xr+b1).*(a2*xr+b2)-c1*c2*xr==0,xr);
 rsol=eval(rsol);
 r=[-sqrt(rsol)];
 r1=r(1,:);
 r2=r(2,:);

can also be vectorized by using fsolve as it is shown here:

fun=@(Xr) (a1.*Xr+b1).*(a2.*Xr+b2)-c1.*c2.*Xr
x01=-ones(2*Nxi);
x02=ones(2*Nxi);
options.Algorithm= 'trust-region-reflective'
options.JacobPattern=speye(4*Nxi^2);
options.PrecondBandWidth=0;
[rsol1]=fsolve(fun,x01,options);
[rsol2]=fsolve(fun,x02,options);

The other part, i.e,

 syms As1 As2;
 try
 [Ass1,Ass2]=vpasolve(BC11*As1+BC12*As2==0,BC21*As1+BC22*As2+Fdf1==0,As1,As2);
      A1=eval(Ass1);
      A2=eval(Ass2);
  catch
      A1=0.0;
      A2=0.0;
  end

since contains linear equations and in each row has two dependent equations can be solved as it is shown here:

 funAB1=@(As1) BC11.*As1+BC12.*((-Fdf2-BC21.*As1)./BC22);

 x0AB=ones(2*Nxi)+1i*ones(2*Nxi);
 options.Algorithm= 'trust-region-reflective';
 options.JacobPattern=speye(4*Nxi^2);
 options.PrecondBandWidth=0;
 [A1]=fsolve(funAB1,x0AB,options);

 A2=((-Fdf2-BC21.*A1)./BC22); 

However, both can also be solved analytically, i.e.,

AAr=a1.*a2;
BBr=a1.*b2+b1.*a2-c1.*c2;
CCr=b1.*b2;

Xr1=(-BBr+sqrt(BBr^.2-4.*AAr.*CCr))./(2.*AAr);
Xr2=(-BBr-sqrt(BBr^.2-4.*AAr.*CCr))./(2.*AAr);
r1=-sqrt(Xr1(:,:));
r2=-sqrt(Xr2(:,:));

Upvotes: 0

beesleep
beesleep

Reputation: 1362

Everything in your function but the calls to vpasolve, and try.... can be vectorize.

First, all this does not depend on w or xi, so could be computed once only:

format long e;
z=0;mm=0.46;sop=80;
epit=0.1;voP=80.;
rho=2.1;aposC=0.1;aposD=0.1;
parEh=0.2*10^6;parEv=0.2*10^6;parGv=0.074*10^6;
parpos=0.35;hp=0.2;Ep=30*10^6;
parposV=0.20;ll=0.15;dd=2*ll;

C11=(parEh*(parEv-parEh*parpos^2)/((1+parpos)*(parEv-parEv*parpos-2*parEh*parpos^2)))*(1+2*1i*aposD);
C13=(parEh*parEv*parpos/((parEv-parEv*parpos-2*parEh*parpos^2)))*(1+2*1i*aposD);
C33=((1-parpos)*parEv^2/(parEv-parEv*parpos-2*parEh*parpos^2))*(1+2*1i*aposD);
C44=parGv*(1+2*1i*aposD);
DD=(Ep*hp^3)/(12*(1-parposV^2));
a1=C44;
a2=C33;

From what I know, eval is slow, and I'm pretty sure that you don't need it:

rsol=eval(rsol);

Here is an example of vectorization. You should first generate all indices combination using meshgrid, and then use the . to noticed matlab to use element wise operations:

[I, J] = meshgrid(1:1000, 1:1000)
c1=(C13+C44)*1i.*xi;
c2=(C13+C44)*1i.*xi;
b1=rho.*w.^2 - C11.*xi.^2;
b2=rho.*w.^2-C44.*xi.^2;

But you won't be able to vectorize vpasolve, and try.... litteraly, without changing it to something else. vpasolve is probably the bottleneck of you computation (verify this using matlab profiler), so optimizing as proposed above will probably not reduce your computation time much.

Then you have several solutions:

  • use parfor if you have access to it to parallelize your computations, which depending on your architecture, could give you a 4x speedup or so.
  • it may be possible to linearize your equations and solve them all in one operation. Anyway, using a linear solver will be probably much faster than using vpasolve. This will probably give you the fastest speedup (guessing factor 10 -100 ?)
  • because you have continuous data, you could reduce the number of steps, if you dare loosing precision.

Hope this helps

Upvotes: 1

Related Questions