BillBokeey
BillBokeey

Reputation: 3511

mldivide versus (LU & linsolve)

This question might be too broad to be posted here but I'll try to be as specific as possible. If you still consider it to be too broad, I'll simply delete it.

My problem :

I'm solving dynamic equations by using a Newmark scheme (2nd order implicit), which involves solving a lot of linear systems of the form A*x=b for x.

I've already optimized all the code that doesn't involve solving linear systems. As it stands now, the linear systems solving take up to 70% of the calculation time in the process.

I've though using MATLAB's linsolve, but my matrix A doesn't have any of the properties that could be used as opts input for linsolve.

The idea :

As seen in the documentation of linsolve :

If A has the properties in opts, linsolve is faster than mldivide, because linsolve does not perform any tests to verify that A has the specified properties

As far as I know, by using mldivide, MATLAB will use LU decomposition as my matrix A doens't have any specific property except for being square.

My question :

So I'm wondering if I'd gain some time by first decomposing A using MATLAB's lu, and then feed these to linsolve in order to solve x = U\(L\b) with opts being respectively upper and lower triangular. That way I'd prevent MATLAB of doing all the properties checking that takes place during the mldivide process.

Note : I'm absolutely not expecting a huge time gain. But on calculations that take up to a week, even 2% matter..

Now why don't I try this myself you may ask? Well I've got calculations running until tuesday approximatively, and I'd want to ask if someone has already tried this and gained time, getting rid of the overhead due to matrix property checking by mldivide.

Toy example :

A=randn(2500);
% Getting A to be non singular
A=A.'*A;
x_=randn(2500,1);
b=A*x_;
clear x_

% Case 1 : mldivide
tic
for ii=1:100

    x=A\b;

end
out=toc;
disp(['Case 1 time per iteration :' num2str((out)/100)]);

% Case 2 : LU+linsolve

opts1.LT=true;
opts2.UT=true;

tic;
for ii=1:100

    [L,U]=lu(A);

    % It seems that these could be directly replaced by U\(L\b) as mldivide check for triangularity first
    Tmp=linsolve(L,b,opts1);
    x=linsolve(U,Tmp,opts2);

end
out2=toc;

disp(['Case 2 time per iteration :' num2str((out2)/100)]);

EDIT

So I just had the possibility to try a few things.

I missed earlier in the documentation of linsolve that if you don't specify any opts input it will default to using the LU solver, which is what I want. Doing a bit of time testing with it (And taking into account @rayryeng 's remark to "timeit that bad boy"), it saves around 2~3% of processing time when compared to mldivide, as shown below. It's not a huge deal in term of time gain, but it's something non neglictible on calculations that take up to a week.

timeit results on a 1626*1626 linear system:

mldivide :

 t1 =

   0.102149773097083   

linsolve :

t2 =

   0.099272037768204

relative : 0.028171725121151

Upvotes: 4

Views: 1653

Answers (1)

Ander Biguri
Ander Biguri

Reputation: 35525

I know you do not have NVIDIA GPU and parallel computing toolbox, but if you had, this would work:

If you replace the second test in your code by:

tic;

for ii=1:10
        A2=gpuArray(A); % so we account for memory management
        b2=gpuArray(b);
      x=A2\b2;
end
out2=toc;

My PC says (CPU vs GPU)

Case 1 time per iteration :0.011881
Case 2 time per iteration :0.0052003

Upvotes: 4

Related Questions