Uri

Reputation: 27016

How to run a Python program that uses NumPy on multiple cores, preferably using threading

I want to run a computation in parallel and return the result to the main thread. Since this happens many times, I assume the overhead of inter-process messaging will hurt performance (is this assumption correct?), so I would prefer to use threads.

As I understand it, threads will run on different cores only if I use Jython or IronPython (which is better?).

Assuming this is correct, is all I have to do switch my Eclipse interpreter to one of the above?

Finally, I'm using NumPy. Is this a problem? Will the Jython/IronPython implementation hinder NumPy performance?

Update:

I am now trying to use multiprocessing, as per the recommendations below. I'm having trouble passing the arguments in a neat way. Also, for some reason, when I stop the application, the processes it opened do not close and I have to restart my computer! This is what I'm trying to do:

pool = multiprocessing.Pool()
results = pool.map(my_class(param1=bla1, param2=bla2), list_args)

Here list_args is a list of arguments for the __call__ method of the class my_class, and bla1 and bla2 are NumPy arrays.

Queries:

  1. The default number of worker processes in the pool is cpu_count(). I assume this is optimal?

  2. Why is this not working? (The processes do not seem to return...)

Upvotes: 2

Views: 375

Answers (1)

Uri

Reputation: 27016

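I was missing a main function, guarded by if __name__ == "__main__", to wrap everything. This matters for multiprocessing: on platforms that start workers by spawning a fresh interpreter (Windows, for example), each child re-imports the main module, so unguarded pool-creation code runs again in every child.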

def main():
    run_my_stuff()

if __name__ == "__main__":
    main()
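
Putting the guard together with the pool.map call from the question, a minimal sketch could look like the following. MyClass, its placeholder arithmetic, and the sample inputs are hypothetical; plain lists stand in for the NumPy arrays bla1 and bla2 so that the sketch needs only the standard library.

```python
import multiprocessing


class MyClass:
    """Hypothetical stand-in for my_class: holds fixed parameters and is
    called once per task."""

    def __init__(self, param1, param2):
        self.param1 = param1
        self.param2 = param2

    def __call__(self, arg):
        # Placeholder computation; the real work would use the NumPy arrays.
        return arg * sum(self.param1) + sum(self.param2)


def run_my_stuff(list_args):
    worker = MyClass(param1=[1.0, 2.0], param2=[3.0, 4.0])
    # Leaving the with-block terminates the worker processes, so none are
    # left running once the results have been returned.
    with multiprocessing.Pool() as pool:
        return pool.map(worker, list_args)


def main():
    print(run_my_stuff([0, 1, 2]))


if __name__ == "__main__":
    main()
```

The callable instance is pickled and sent to the workers along with the tasks, so large parameter arrays add to the messaging cost.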

Also, the same symptom can occur if the queue your processes are using is full (it has a limited capacity): a process that tries to put another item blocks until space frees up. It can therefore help to change the code so that items are pulled from the queue while others are still being added.
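
As a rough illustration of draining the queue while items are still being added, here is a sketch with one producer process and the main process as consumer. The names producer, collect and SENTINEL, the maxsize of 4, and the squared values are all assumptions made for the example, not part of my original code.

```python
import multiprocessing

SENTINEL = None  # marks the end of the stream (a convention of this sketch)


def producer(queue, n_items):
    for i in range(n_items):
        queue.put(i * i)  # blocks while the queue is full
    queue.put(SENTINEL)


def collect(n_items, maxsize=4):
    queue = multiprocessing.Queue(maxsize=maxsize)
    proc = multiprocessing.Process(target=producer, args=(queue, n_items))
    proc.start()
    results = []
    # Pull items while the producer is still running. Calling proc.join()
    # before draining would hang as soon as the producer has more than
    # maxsize items to put, because it would block on a full queue.
    while True:
        item = queue.get()
        if item is SENTINEL:
            break
        results.append(item)
    proc.join()
    return results


def main():
    print(collect(20))


if __name__ == "__main__":
    main()
```

Joining only after the sentinel has been read guarantees the producer has nothing left to put, so the join cannot block on a full queue.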

Upvotes: 1
