Yan Song

Reputation: 110

Comparison of matrix multiplication speed in MATLAB and C

We can do matrix multiplication in several ways. A and B are two matrices, each of size 1000 × 1000.

  1. Use the MATLAB built-in matrix operator: A*B.
  2. Write two nested loops explicitly.
  3. Write three nested loops explicitly.
  4. Dynamically link C code that performs the three nested loops.

The results are:

  1. The * operator is very fast: about 1 second.
  2. The MATLAB loops are very slow, especially the three nested loops: about 500 seconds.
  3. The C loops are much faster than the MATLAB loops: about 10 seconds.

Can anyone explain the following: why is * so fast, and why are the C loops more efficient than writing the loops in MATLAB?

I actually used another piece of software, not MATLAB, to do the computation, but I think similar reasoning should apply.

Upvotes: 1

Views: 743

Answers (3)

edgarmtze

Reputation: 25058

There is a webpage that compares matrix multiplication speeds in C using naïve code and BLAS code:

Naïve code

/* Naive triple loop over column-major n x n matrices,
   stored flat: element (i, j) lives at index i + j*n. */
for (i = 0; i < n; i++)
    for (j = 0; j < n; j++) {
        C[i + j*n] = 0;
        for (k = 0; k < n; k++)
            C[i + j*n] += A[i + k*n] * B[k + n*j];
    }

BLAS code

dgemm_(&charN, &charN, &n, &n, &n, &one_d, A, &n, B, &n, &zero_d, C, &n);

(The linked page shows benchmark plots comparing the naïve and BLAS versions.)

Using those libraries may be a good option to improve your speed.

Upvotes: 1

LaszloLadanyi

Reputation: 973

MATLAB is an interpreted language; C is compiled. Therefore C loops are much faster than MATLAB loops. That explains the difference between results 2 and 3.

Similarly, MATLAB's A*B is also compiled code under the hood. Then why is it still an order of magnitude faster than the C code? After all, if it were just the simple nested loops, the code is so simple that the compiler could optimize it to be about as fast as hand-written assembly. Well, the answer is that it is most likely not just nested loops. The simple nested-loop algorithm runs in O(n^3) time, but if you recursively break up the matrix you can relatively easily get the O(n^2.8) Strassen algorithm (http://en.wikipedia.org/wiki/Strassen_algorithm), which performs well in practice; you can even get down to the O(n^2.37) Coppersmith-Winograd algorithm (http://en.wikipedia.org/wiki/Coppersmith%E2%80%93Winograd_algorithm), which is not quite practical.

My bet is that MATLAB uses some form of the Strassen algorithm, which not only has better asymptotic speed but, because the submatrices it eventually multiplies are small, also has better cache utilization.

Upvotes: 2

Oliver Charlesworth

Reputation: 272687

Because matrix multiplication is highly optimised in MATLAB, which uses BLAS/LAPACK libraries under the hood.

You will find it hard to beat the performance of those libraries. Certainly, simple nested loops don't take cache effects into account, and will thus exhibit terrible performance.

Upvotes: 3
