Reputation: 3511
This question might be too broad to be posted here but I'll try to be as specific as possible. If you still consider it to be too broad, I'll simply delete it.
I'm solving dynamic equations by using a Newmark scheme (2nd order implicit), which involves solving a lot of linear systems of the form A*x=b
for x
.
I've already optimized all the code that doesn't involve solving linear systems. As it stands now, the linear systems solving take up to 70% of the calculation time in the process.
I've though using MATLAB's linsolve
, but my matrix A
doesn't have any of the properties that could be used as opts
input for linsolve
.
As seen in the documentation of linsolve
:
If A has the properties in opts, linsolve is faster than mldivide, because linsolve does not perform any tests to verify that A has the specified properties
As far as I know, by using mldivide
, MATLAB will use LU decomposition as my matrix A
doens't have any specific property except for being square.
So I'm wondering if I'd gain some time by first decomposing A
using MATLAB's lu
, and then feed these to linsolve in order to solve x = U\(L\b)
with opts
being respectively upper and lower triangular.
That way I'd prevent MATLAB of doing all the properties checking that takes place during the mldivide
process.
Note : I'm absolutely not expecting a huge time gain. But on calculations that take up to a week, even 2% matter..
Now why don't I try this myself you may ask? Well I've got calculations running until tuesday approximatively, and I'd want to ask if someone has already tried this and gained time, getting rid of the overhead due to matrix property checking by mldivide.
A=randn(2500);
% Getting A to be non singular
A=A.'*A;
x_=randn(2500,1);
b=A*x_;
clear x_
% Case 1 : mldivide
tic
for ii=1:100
x=A\b;
end
out=toc;
disp(['Case 1 time per iteration :' num2str((out)/100)]);
% Case 2 : LU+linsolve
opts1.LT=true;
opts2.UT=true;
tic;
for ii=1:100
[L,U]=lu(A);
% It seems that these could be directly replaced by U\(L\b) as mldivide check for triangularity first
Tmp=linsolve(L,b,opts1);
x=linsolve(U,Tmp,opts2);
end
out2=toc;
disp(['Case 2 time per iteration :' num2str((out2)/100)]);
So I just had the possibility to try a few things.
I missed earlier in the documentation of linsolve
that if you don't specify any opts
input it will default to using the LU
solver, which is what I want. Doing a bit of time testing with it (And taking into account @rayryeng 's remark to "timeit
that bad boy"), it saves around 2~3% of processing time when compared to mldivide
, as shown below. It's not a huge deal in term of time gain, but it's something non neglictible on calculations that take up to a week.
timeit
results on a 1626*1626
linear system:
mldivide
:
t1 =
0.102149773097083
linsolve
:
t2 =
0.099272037768204
relative : 0.028171725121151
Upvotes: 4
Views: 1653
Reputation: 35525
I know you do not have NVIDIA GPU and parallel computing toolbox, but if you had, this would work:
If you replace the second test in your code by:
tic;
for ii=1:10
A2=gpuArray(A); % so we account for memory management
b2=gpuArray(b);
x=A2\b2;
end
out2=toc;
My PC says (CPU vs GPU)
Case 1 time per iteration :0.011881
Case 2 time per iteration :0.0052003
Upvotes: 4