Reputation: 41
I am trying to understand multiprocessing in Python.
I made a test program that finds the max number from a set of lists. It works fine for a limited number of processes, but the program hangs if I increase the number to, say, 5000 processes.
Am I doing something wrong? Why does it hang when I increase the number of processes?
Here is my code:
from multiprocessing import Process, Manager
import numpy.random as npyrnd

def getMaxRand(_num, shared_dict):
    '''
    create a list of random numbers
    picks max from list
    '''
    print 'starting process num:', _num
    rndList = npyrnd.random(size=100)
    maxrnd = max(rndList)
    print 'ending process:', _num
    shared_dict[_num] = maxrnd

if __name__ == '__main__':
    processes = []
    manager = Manager()
    shared_dict = manager.dict()
    for i in range(50):  # hangs when this is increased to say 5000
        p = Process(target=getMaxRand, args=(i, shared_dict))
        processes.append(p)
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print shared_dict
EDIT: Having read some of the responses, it's clear that I cannot just create an arbitrary number of processes and hope that the multiprocessing library queues them for me. So my follow-up question is: how can I determine the maximum number of processes that I can run simultaneously?
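A rough sketch of one common starting point for the follow-up question, not a definitive answer: use the CPU count reported by multiprocessing.cpu_count() as the number of workers. The helper name suggested_worker_count and its reserve parameter are hypothetical, introduced only for illustration, and the snippet is written for Python 3, unlike the Python 2 code above.

```python
import multiprocessing


def suggested_worker_count(reserve=0):
    """Hypothetical helper: CPU count minus `reserve` cores, never below 1."""
    cores = multiprocessing.cpu_count()
    return max(1, cores - reserve)


if __name__ == '__main__':
    print('CPU cores reported:', multiprocessing.cpu_count())
    print('suggested workers:', suggested_worker_count())
```

This only suggests how many CPU-bound processes are likely to be useful at once. It does not tell you the hard limit your operating system places on the number of processes.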
Upvotes: 2
Views: 4473
Reputation: 41
I managed to work around the problem of a large number of processes hanging my PC. The approach below appears to work for a fairly large number of tasks (I tested up to 50000), because a pool of 8 worker processes handles all of them.
This is how I approached the problem:
from multiprocessing import Pool
import numpy.random as npyrnd

full_result = {}

def getMaxRand(_num):
    '''
    create a list of random numbers
    picks max from list
    '''
    print 'starting process num:', _num
    rndList = npyrnd.random(size=100)
    maxrnd = max(rndList)
    print 'ending process:', _num
    return (_num, maxrnd)

def accumulateResults(result):
    print 'getting result', result
    full_result[result[0]] = result[1]

def doProcesses():
    pool = Pool(processes=8)
    for i in range(5000):  # if I increase this number will it crash?
        pool.apply_async(getMaxRand, args=(i,), callback=accumulateResults)
    pool.close()
    pool.join()

if __name__ == '__main__':
    doProcesses()
    print 'FINAL:', full_result
Thanks @mgilson and @Kylo for pointing me in this direction.
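For comparison, here is a minimal sketch of the same idea using Pool.map instead of apply_async with a callback. It is a variant, not the code I actually ran: it is written for Python 3, it swaps in the standard library's random module for numpy.random so that it has no third-party dependency, and the names get_max_rand and run_tasks are made up for this example.

```python
import random
from multiprocessing import Pool


def get_max_rand(num):
    """Build a list of 100 random floats and return (num, max of the list)."""
    values = [random.random() for _ in range(100)]
    return num, max(values)


def run_tasks(n_tasks, n_workers=8):
    """Run n_tasks jobs on a fixed pool and collect the results in a dict."""
    # map() blocks until every task has finished, so all results are
    # available before the with-block shuts the pool down.
    with Pool(processes=n_workers) as pool:
        return dict(pool.map(get_max_rand, range(n_tasks)))


if __name__ == '__main__':
    results = run_tasks(5000)
    print('FINAL:', len(results), 'results')
```

As in the apply_async version, only 8 processes exist at any time however many tasks are submitted. The pool hands the tasks out to those workers.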
Upvotes: 2