Reputation: 19
Dear community of stack overflow. Is it possible to improve the speed of this code in Matlab? Can I use vectorization? Note that I have to use in every loop the "vpasolve" or "fsolve" function.
Here is the code which calls the function with a double loop:
for i=1:1000
for j=1:1000
SilA(i,j)=SolW(i,j);
end
end
Here is the function which is called by the above code. Can I vectorize the input of w, xi and make the code run faster?
function [ffw] = SolW(w,xi)
format long e;
z=0;mm=0.46;sop=80;
epit=0.1;voP=80.;
rho=2.1;aposC=0.1;aposD=0.1;
parEh=0.2*10^6;parEv=0.2*10^6;parGv=0.074*10^6;
parpos=0.35;hp=0.2;Ep=30*10^6;
parposV=0.20;ll=0.15;dd=2*ll;
C11=(parEh*(parEv-parEh*parpos^2)/((1+parpos)*(parEv-parEv*parpos-2*parEh*parpos^2)))*(1+2*1i*aposD);
C13=(parEh*parEv*parpos/((parEv-parEv*parpos-2*parEh*parpos^2)))*(1+2*1i*aposD);
C33=((1-parpos)*parEv^2/(parEv-parEv*parpos-2*parEh*parpos^2))*(1+2*1i*aposD);
C44=parGv*(1+2*1i*aposD);
DD=(Ep*hp^3)/(12*(1-parposV^2));
a1=C44;
a2=C33;
c1=(C13+C44)*1i*xi;
c2=(C13+C44)*1i*xi;
b1=rho*w^2-C11*xi^2;
b2=rho*w^2-C44*xi^2;
syms xr
rsol=vpasolve((a1*xr+b1).*(a2*xr+b2)-c1*c2*xr==0,xr);
rsol=eval(rsol);
r=[-sqrt(rsol)];
r1=r(1,:);
r2=r(2,:);
Fdf1=sop*sqrt(2*pi/(1i*epit*xi))*exp(1i*(xi*voP+w)^2/(2*epit*xi));
BC11=C44*(r1-1i*xi*c2*r1/(a2*r1^2+b2));
BC21=(C13*1i*xi)-((C33*c2*r1^2)/(a2*r1^2+b2))+(DD*xi^4-mm*w^2+1i*aposC*w)*c2*r1/(a2*r1^2+b2);
BC12=C44*(r2-1i*xi*c2*r2/(a2*r2^2+b2));
BC22=(C13*1i*xi)-((C33*c2*r2^2)/(a2*r2^2+b2))+(DD*xi^4-mm*w^2+1i*aposC*w)*c2*r2/(a2*r2^2+b2);
syms As1 As2;
try
[Ass1,Ass2]=vpasolve(BC11*As1+BC12*As2==0,BC21*As1+BC22*As2+Fdf1==0,As1,As2);
A1=eval(Ass1);
A2=eval(Ass2);
catch
A1=0.0;
A2=0.0;
end
Bn1=-(c2*r1/(a2*r1^2+b2))*A1;
Bn2=-(c2*r2/(a2*r2^2+b2))*A2;
ffw=Bn1*exp(r1*z)+Bn2*exp(r2*z);
end
Upvotes: 0
Views: 179
Reputation: 19
In the above program everything can be vectorized as @beesleep said above. For example:
[I, J] = meshgrid(1:1000, 1:1000)
c1=(C13+C44)*1i.*xi;
c2=(C13+C44)*1i.*xi;
b1=rho.*w.^2 - C11.*xi.^2;
b2=rho.*w.^2-C44.*xi.^2;
The vpasolve part ,i.e.,
syms xr
rsol=vpasolve((a1*xr+b1).*(a2*xr+b2)-c1*c2*xr==0,xr);
rsol=eval(rsol);
r=[-sqrt(rsol)];
r1=r(1,:);
r2=r(2,:);
can also be vectorized by using fsolve as it is shown here:
fun=@(Xr) (a1.*Xr+b1).*(a2.*Xr+b2)-c1.*c2.*Xr
x01=-ones(2*Nxi);
x02=ones(2*Nxi);
options.Algorithm= 'trust-region-reflective'
options.JacobPattern=speye(4*Nxi^2);
options.PrecondBandWidth=0;
[rsol1]=fsolve(fun,x01,options);
[rsol2]=fsolve(fun,x02,options);
The other part, i.e,
syms As1 As2;
try
[Ass1,Ass2]=vpasolve(BC11*As1+BC12*As2==0,BC21*As1+BC22*As2+Fdf1==0,As1,As2);
A1=eval(Ass1);
A2=eval(Ass2);
catch
A1=0.0;
A2=0.0;
end
since contains linear equations and in each row has two dependent equations can be solved as it is shown here:
funAB1=@(As1) BC11.*As1+BC12.*((-Fdf2-BC21.*As1)./BC22);
x0AB=ones(2*Nxi)+1i*ones(2*Nxi);
options.Algorithm= 'trust-region-reflective';
options.JacobPattern=speye(4*Nxi^2);
options.PrecondBandWidth=0;
[A1]=fsolve(funAB1,x0AB,options);
A2=((-Fdf2-BC21.*A1)./BC22);
However, both can also be solved analytically, i.e.,
AAr=a1.*a2;
BBr=a1.*b2+b1.*a2-c1.*c2;
CCr=b1.*b2;
Xr1=(-BBr+sqrt(BBr^.2-4.*AAr.*CCr))./(2.*AAr);
Xr2=(-BBr-sqrt(BBr^.2-4.*AAr.*CCr))./(2.*AAr);
r1=-sqrt(Xr1(:,:));
r2=-sqrt(Xr2(:,:));
Upvotes: 0
Reputation: 1362
Everything in your function but the calls to vpasolve
, and try....
can be vectorize.
First, all this does not depend on w
or xi
, so could be computed once only:
format long e;
z=0;mm=0.46;sop=80;
epit=0.1;voP=80.;
rho=2.1;aposC=0.1;aposD=0.1;
parEh=0.2*10^6;parEv=0.2*10^6;parGv=0.074*10^6;
parpos=0.35;hp=0.2;Ep=30*10^6;
parposV=0.20;ll=0.15;dd=2*ll;
C11=(parEh*(parEv-parEh*parpos^2)/((1+parpos)*(parEv-parEv*parpos-2*parEh*parpos^2)))*(1+2*1i*aposD);
C13=(parEh*parEv*parpos/((parEv-parEv*parpos-2*parEh*parpos^2)))*(1+2*1i*aposD);
C33=((1-parpos)*parEv^2/(parEv-parEv*parpos-2*parEh*parpos^2))*(1+2*1i*aposD);
C44=parGv*(1+2*1i*aposD);
DD=(Ep*hp^3)/(12*(1-parposV^2));
a1=C44;
a2=C33;
From what I know, eval
is slow, and I'm pretty sure that you don't need it:
rsol=eval(rsol);
Here is an example of vectorization. You should first generate all indices combination using meshgrid, and then use the .
to noticed matlab to use element wise operations:
[I, J] = meshgrid(1:1000, 1:1000)
c1=(C13+C44)*1i.*xi;
c2=(C13+C44)*1i.*xi;
b1=rho.*w.^2 - C11.*xi.^2;
b2=rho.*w.^2-C44.*xi.^2;
But you won't be able to vectorize vpasolve
, and try....
litteraly, without changing it to something else. vpasolve
is probably the bottleneck of you computation (verify this using matlab profiler), so optimizing as proposed above will probably not reduce your computation time much.
Then you have several solutions:
parfor
if you have access to it to parallelize your computations, which depending on your architecture, could give you a 4x speedup or so.vpasolve
. This will probably give you the fastest speedup (guessing factor 10 -100 ?)Hope this helps
Upvotes: 1