Geoff

Reputation: 8145

Matrix multiplication running times Python < C++ < MATLAB - explain

I have a matrix M that's 16384 x 81. I want to compute M * M.t() (the result will be 16384 x 16384).

My question is: could somebody please explain the running time differences?

Using OpenCV in C++, the following code takes 18 seconds:

#include <cv.h>
#include <cstdio>
using namespace cv;

int main(void) {
  // 16384 x 81 single-precision matrix with uniform random values in [0, 1)
  Mat m(16384, 81, CV_32FC1);
  randu(m, Scalar(0), Scalar(1));

  // Time only the multiplication m * m^T (result is 16384 x 16384)
  int64 tic = getTickCount();
  Mat m2 = m * m.t();
  printf("%f\n", (getTickCount() - tic) / getTickFrequency());
}

In Python the following code takes 18.8 seconds, not the 0.9 seconds originally reported (see the edit below):

import numpy as np
from time import time

# 16384 x 81 double-precision matrix of uniform random values in [0, 1)
m = np.random.rand(16384, 81)

# Time only the multiplication m * m^T
tic = time()
result = np.dot(m, m.T)
print(time() - tic)

In MATLAB the following code takes 17.7 seconds:

m = rand(16384, 81);  % 16384 x 81 double-precision random matrix
tic;
result = m * m';      % time only the multiplication m * m'
toc;

My only guess would have been that it's a memory issue, and that somehow Python avoids hitting swap (the 16384 x 16384 result is about 1 GB in single precision and 2 GB in double precision). When I watch top, however, my C++ application does not appear to use all the memory, and I had expected C++ to win the day. Thanks for any insights.

Edit

After revising my examples to time only the multiplication, the Python code also takes about 18 seconds. I'm really not sure what's going on, but as long as there's enough memory, all three now seem to perform about the same.

Here are the timings when the number of rows is 8192:

C++: 4.5 seconds
Python: 4.2 seconds
MATLAB: 1.8 seconds

Upvotes: 0

Views: 1960

Answers (1)

Ben Voigt

Reputation: 283793

What CPU are you running on? For modern x86 and x64 chips with dynamic clocking, getTickCount and getTickFrequency cannot be trusted.

18 seconds is long enough to get acceptable precision from the standard OS functions based on the timer interrupt.
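
A minimal sketch of the same measurement using std::chrono's steady_clock instead of getTickCount/getTickFrequency (this assumes the same OpenCV 2.x setup as in the question; opencv2/core/core.hpp is the modular core header):

#include <opencv2/core/core.hpp>
#include <chrono>
#include <cstdio>
using namespace cv;

int main(void) {
  Mat m(16384, 81, CV_32FC1);
  randu(m, Scalar(0), Scalar(1));

  // steady_clock is a monotonic, OS-backed clock, so it is not thrown off
  // by dynamic CPU frequency changes
  auto tic = std::chrono::steady_clock::now();
  Mat m2 = m * m.t();
  auto toc = std::chrono::steady_clock::now();

  std::chrono::duration<double> elapsed = toc - tic;
  printf("%f s\n", elapsed.count());
}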

And what BLAS are you using with OpenCV? MATLAB ships with highly optimized BLAS libraries; IIRC it even detects your CPU and loads either Intel's or AMD's math library accordingly.
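
One way to check is to dump OpenCV's build configuration, which lists the compiler flags, enabled CPU optimizations, and third-party math libraries it was built against. A minimal sketch (cv::getBuildInformation() is available in OpenCV 2.4 and later):

#include <opencv2/core/core.hpp>
#include <iostream>

int main(void) {
  // Prints OpenCV's build configuration, including any linked BLAS/LAPACK,
  // IPP, Eigen, and enabled CPU optimizations
  std::cout << cv::getBuildInformation() << std::endl;
}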

Upvotes: 3
