adi

Reputation: 41

Multiprocessing hangs when number of processes is increased

I am trying to understand multiprocessing in Python.

I made a test program that finds the max number from a set of lists. It works fine for a small number of processes, but the program hangs if I increase the count to, say, 5000 processes.

Am I doing something wrong? Why does it hang if I increase the number of processes?

Here is my code:

from multiprocessing import Process, Manager
import numpy.random as npyrnd

def getMaxRand(_num, shared_dict):
    '''
    Create a list of random numbers and pick the max from the list.
    '''
    print('starting process num:', _num)
    rndList = npyrnd.random(size=100)
    maxrnd = max(rndList)
    print('ending process:', _num)
    shared_dict[_num] = maxrnd


if __name__ == '__main__':
    processes = []
    manager = Manager()
    shared_dict = manager.dict()
    for i in range(50):  # hangs when this is increased to, say, 5000
        p = Process(target=getMaxRand, args=(i, shared_dict))
        processes.append(p)
    for p in processes:
        p.start()
    for p in processes:
        p.join()

    print(shared_dict)

EDIT: Having read some of the responses, it's clear that I can't just create an arbitrary number of processes and hope that the multiprocessing library queues them for me. So a follow-up question: how can I determine the maximum number of processes I can run simultaneously?
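A common starting point is the number of CPU cores the machine reports; a minimal sketch using the standard library (the pool size here is only an illustrative choice for CPU-bound work):

from multiprocessing import Pool, cpu_count

if __name__ == '__main__':
    n_workers = cpu_count()  # one worker per CPU core is a common default
    print('CPU cores:', n_workers)
    pool = Pool(processes=n_workers)  # Pool() with no argument also defaults to cpu_count()
    pool.close()
    pool.join()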

Upvotes: 2

Views: 4473

Answers (1)

adi

Reputation: 41

I managed to stop the large number of processes from hanging my PC. It appears to work for a fairly large number of tasks (I tested up to 50000).

This is how I approached the problem:

from multiprocessing import Pool
import numpy.random as npyrnd


full_result = {}

def getMaxRand(_num):
    '''
    Create a list of random numbers and pick the max from the list.
    '''
    print('starting process num:', _num)
    rndList = npyrnd.random(size=100)
    maxrnd = max(rndList)
    print('ending process:', _num)

    return (_num, maxrnd)

def accumulateResults(result):
    # The callback runs in the parent process, so it can safely
    # update the ordinary dict full_result.
    print('getting result', result)
    full_result[result[0]] = result[1]

def doProcesses():
    pool = Pool(processes=8)
    for i in range(5000):  # if I increase this number, will it crash?
        pool.apply_async(getMaxRand, args=(i,), callback=accumulateResults)
    pool.close()
    pool.join()


if __name__ == '__main__':
    doProcesses()
    print('FINAL:', full_result)
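As a side note, the same result can be collected without a callback by using pool.map, which returns the results in submission order; a minimal sketch of that variant:

from multiprocessing import Pool
import numpy.random as npyrnd

def getMaxRand(_num):
    # same worker as above: max of 100 random numbers
    return (_num, max(npyrnd.random(size=100)))

if __name__ == '__main__':
    pool = Pool(processes=8)
    full_result = dict(pool.map(getMaxRand, range(5000)))  # blocks until all tasks finish
    pool.close()
    pool.join()
    print('FINAL:', full_result)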

Thanks @mgilson and @Kylo for pointing me in this direction.

Upvotes: 2
