Amcum Yidwen

Reputation: 31

Parallel computing

I have a two-dimensional table (a matrix), and I need to process each line (row) of this matrix independently of the others. Processing each line is time-consuming, so I'd like to use the parallel computing resources at our university (Canadian Grid something).

Can I have some advice on how to start? I have never used parallel computing before.

Thanks :)

Upvotes: 2

Views: 1254

Answers (5)

bmu

Reputation: 36224

From what you describe, I would say: first have a look at NumPy. NumPy provides methods that operate on whole rows and columns in a vectorized manner at close to C speed. Depending on your problem, this could be faster than parallel computation with pure CPython.
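To make the vectorization point concrete, here is a minimal sketch. The per-row task (dividing each row by its own sum) and the function names are assumptions chosen for illustration, since the question does not say what the real computation is. It shows the same result computed row by row in pure Python and for all rows at once with NumPy.

```python
import numpy as np


def normalize_rows_loop(matrix):
    """Process one row at a time in pure Python (slow for large inputs)."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([value / total for value in row])
    return result


def normalize_rows_vectorized(matrix):
    """Same computation for all rows at once, using NumPy broadcasting."""
    data = np.asarray(matrix, dtype=float)
    # keepdims=True gives shape (n_rows, 1), so each row is divided by its own sum
    row_sums = data.sum(axis=1, keepdims=True)
    return data / row_sums


if __name__ == "__main__":
    table = [[1.0, 1.0, 2.0], [2.0, 6.0, 8.0]]
    print(normalize_rows_vectorized(table))
```

Whether this helps depends on whether the real per-row work can be written as array operations; if it cannot, spreading the rows over several processes is the more likely route.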

You can then combine parallel computing with NumPy arrays to get a really big speedup. Possible ways to do this are multiprocessing or IPython on a cluster.

Upvotes: 0

Gourab Chakraborty

Reputation: 1

It is recommended that you use C or C++ for this computation. You can use the OpenMP API by including the omp.h header (#include <omp.h>). A parallel region is opened with the #pragma omp parallel directive. Since you are parallelising a for-loop over your matrix, you can place #pragma omp parallel for directly before the loop: it opens the parallel region and divides the loop iterations among the threads. OpenMP takes care of creating the threads and of the implicit barrier at the end of the loop.

Check this out for a sample code: https://gist.github.com/metallurgix/0dfafc03215ce89fc595

Remember to use a big matrix to see actual improvements in speed. A small matrix may in fact run slower, because of the overhead of forking and joining the threads.

You can also check out MPI if you want to parallelise your code using multiple processes (which can run on separate machines) instead of multiple threads.

Upvotes: 0

Olivier

Reputation: 153

I am one of the developers of a new library called scoop.

It was built exactly for this purpose (grid or super-computing, scientific computing). I suggest you give it a try.

In your case, all you would have to do is a call like this:

futures.map(YourFunc, matrixLine)

It will then be distributed on your grid or whatever environment you choose.
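A slightly fuller sketch of that call, under some assumptions: your_func and its body are hypothetical placeholders for the real per-line work, and the fallback to the built-in map is my own addition so the script still runs (serially) on a machine where scoop is not installed. As far as I know, a scoop program is launched with python -m scoop script.py, and the map call should sit behind the usual main guard.

```python
try:
    # scoop is a third-party package; futures.map takes (function, iterable)
    from scoop import futures
    parallel_map = futures.map
except ImportError:
    # Assumption for local testing: run serially when scoop is missing
    parallel_map = map


def your_func(matrix_line):
    """Hypothetical per-line computation; replace with the real work."""
    return sum(value * value for value in matrix_line)


def process_matrix(matrix):
    """Map your_func over the matrix, one task per line, keeping line order."""
    return list(parallel_map(your_func, matrix))


if __name__ == "__main__":
    matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(process_matrix(matrix))
```

The function must be defined at module level so that worker processes can import it by name.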

Upvotes: 5

Sideshow Bob

Reputation: 4716

As the commenters have said, find someone to talk to at your university. The answer to your question will depend on what software is installed on the grid. If you have access to a grid, it's highly likely you also have access to a person whose job it is to answer your questions (and they will be pleased to help). Find this person!

Upvotes: 0

S.Lott

Reputation: 392010

Start here: http://docs.python.org/library/multiprocessing.html

Be sure to read this: http://docs.python.org/library/multiprocessing.html#examples

This may be helpful: http://www.slideshare.net/pvergain/multiprocessing-with-python-presentation.

While excellent, it covers both threads and multiprocessing, even though multiprocessing is often far superior to attempting multi-threading.

For Grid computing, multi-threading is largely useless.
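As a starting point for the multiprocessing route, here is a minimal sketch that hands one matrix line to each task. The body of process_row is a hypothetical stand-in for the real, expensive computation, and the processes=1 serial path is an assumption added to make debugging easier. Note that a Pool only uses the cores of a single machine; running across grid nodes depends on what the grid provides.

```python
import math
from multiprocessing import Pool


def process_row(row):
    """Hypothetical per-row work standing in for the expensive computation."""
    return sum(math.sqrt(abs(value)) for value in row)


def process_matrix(matrix, processes=None):
    """Apply process_row to every row, spreading rows over worker processes.

    processes=None lets Pool start one worker per CPU core.
    processes=1 runs serially in the current process, handy for debugging.
    """
    if processes == 1:
        return [process_row(row) for row in matrix]
    with Pool(processes=processes) as pool:
        # pool.map returns results in the same order as the input rows
        return pool.map(process_row, matrix)


if __name__ == "__main__":
    matrix = [[float(i * j) for j in range(1000)] for i in range(200)]
    results = process_matrix(matrix)
    print(len(results))
```

The main guard matters here: on platforms that start workers by re-importing the script, creating the Pool at module level would fail.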

Also, you probably want to read up on celery.

Upvotes: 6
