Rui Martins

Reputation: 3888

Multiprocessing in Python to process a list of parameters

I'm writing my first multiprocessing program in Python.

I want to create a list of values to be processed, and 8 processes (the number of CPU cores) will consume and process the list of values.

I wrote the following Python code:

__author__ = 'Rui Martins'

from multiprocessing import cpu_count, Process, Lock, Value

def proc(lock, number_of_active_processes, valor):
    lock.acquire()
    number_of_active_processes.value+=1
    print "Active processes:", number_of_active_processes.value
    lock.release()
    # DO SOMETHING ...
    for i in range(1, 100):
        valor=valor**2
    # (...)
    lock.acquire()
    number_of_active_processes.value-=1
    lock.release()

if __name__ == '__main__':
    proc_number=cpu_count()
    number_of_active_processes=Value('i', 0)
    lock = Lock()
    values=[11, 24, 13, 40, 15, 26, 27, 8, 19, 10, 11, 12, 13]
    values_processed=0

    processes=[]
    for i in range(proc_number):
        processes+=[Process()]
    while values_processed<len(values):
        while number_of_active_processes.value < proc_number and values_processed<len(values):
            for i in range(proc_number):
                if not processes[i].is_alive() and values_processed<len(values):
                    processes[i] = Process(target=proc, args=(lock, number_of_active_processes, values[values_processed]))
                    values_processed+=1
                    processes[i].start()

            while number_of_active_processes.value == proc_number:
                # BUG: always number_of_active_processes.value == 8 :(
                print "Active processes:", number_of_active_processes.value

    print ""
    print "Active processes at END:", number_of_active_processes.value

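And I have the following problem: the program seems to get stuck in the last while loop, where number_of_active_processes.value is always 8 (see the BUG comment in the code).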

Upvotes: 3

Views: 388

Answers (2)

Padraic Cunningham

Reputation: 180550

Simplifying your code to the following:

from multiprocessing import cpu_count, Process, Lock, Value

def proc(lock, number_of_active_processes, valor):
    lock.acquire()
    number_of_active_processes.value += 1
    print("Active processes:", number_of_active_processes.value)
    lock.release()
    # DO SOMETHING ...
    for i in range(1, 100):
        print(valor)
        valor = valor **2
    # (...)
    lock.acquire()
    number_of_active_processes.value -= 1
    lock.release()


if __name__ == '__main__':
    proc_number = cpu_count()
    number_of_active_processes = Value('i', 0)

    lock = Lock()
    values = [11, 24, 13, 40, 15, 26, 27, 8, 19, 10, 11, 12, 13]
    values_processed = 0

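    processes = [Process() for _ in range(proc_number)]
    while values_processed < len(values):
        for i, p in enumerate(processes):
            # Reuse a slot only if its process is not running and values remain.
            if not p.is_alive() and values_processed < len(values):
                processes[i] = Process(target=proc,
                                       args=(lock, number_of_active_processes, values[values_processed]))
                values_processed += 1
                processes[i].start()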

If you run it as above with the added print(valor), you can see exactly what is happening: repeated squaring makes valor grow so fast that you run out of memory. You are not stuck in the while loop; you are stuck in the for loop.

This is the output at the 12th process after a fraction of a second, with a print(len(str(valor))) added, and it just keeps on going:

2
3
6
11
21
.........
59185
70726
68249
73004
77077
83805
93806
92732
90454
104993
118370
136498
131073
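
As a rough illustration of why those digit counts keep climbing, here is a minimal sketch that records the size of valor after each squaring. The helper name bit_lengths is made up for this example, and it uses int.bit_length() rather than len(str(valor)) because converting a huge integer to a string is itself slow. Each squaring roughly doubles the size of the number, so 99 of them can never finish.

```python
def bit_lengths(valor, squarings):
    """Return the bit length of valor after each of `squarings` squarings.

    Hypothetical helper for illustration; not part of the original code.
    """
    sizes = []
    for _ in range(squarings):
        valor = valor ** 2
        # bit_length() is cheap even for very large integers.
        sizes.append(valor.bit_length())
    return sizes


# Only a handful of squarings, so this finishes instantly.
print(bit_lengths(11, 10))
```

With a small count such as 10 the call returns immediately; every extra squaring doubles the memory needed for the result.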

Just changing your loop to the following:

for i in range(1, 100):
    print(valor)
    valor = valor *2

The last number created is:

 6021340351084089657109340225536
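
That figure can be checked by hand. The sketch below assumes the number came from the input 19 (an inference from the arithmetic, not something stated above): print(valor) runs before each doubling, so the 99th and last pass prints the value after 98 doublings. The function name last_printed is invented for this example.

```python
def last_printed(valor, passes=99):
    """Return the value the final print(valor) would show.

    Mirrors `for i in range(1, 100)`: 99 passes, printing before doubling.
    """
    last = None
    for _ in range(passes):
        last = valor        # this is what print(valor) would display
        valor = valor * 2
    return last


print(last_printed(19))            # 6021340351084089657109340225536
print(len(str(last_printed(19))))  # 31 digits, so doubling stays tiny
```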

With your own code it looks as if you are stuck in the while loop, but what is really happening is that valor keeps growing in the for loop, reaching numbers with as many digits as:

167609
180908
185464
187612
209986
236740
209986

And so on.

Upvotes: 1

oxnz

Reputation: 875

The problem is not your multiprocessing code. It's the power operator (**) in the for loop:

for i in range(1, 100):
    valor=valor**2

The loop squares valor 99 times, so the final result would be pow(val, 2**99). That is far too big: calculating it would cost too much time and memory, so you end up with an out-of-memory error.

4 GB = 4 * pow(2, 10) * pow(2, 10) * pow(2, 10) * 8 bit = 2**35 bit

And for your smallest number, 8:

pow(8, 2**99) = pow(2**3, 2**99) = pow(2, 3*pow(2, 99))
storing pow(2, 3*pow(2, 99)) takes about 3*pow(2, 99) bit
3*pow(2, 99) bit / 4 GB = 3*pow(2, 99-35) = 3*pow(2, 64)

So it needs about 3*pow(2, 64) times 4 GB of memory.
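
The same estimate can be sketched in Python without ever building the huge number. This is only an illustration: the helper name bits_after_squarings is made up, it handles exact powers of two only (such as 8), and it counts the squarings as len(range(1, 100)), which is 99.

```python
def bits_after_squarings(valor, squarings):
    """Bit length of valor ** (2 ** squarings), without computing that power.

    Sketch that assumes valor == 2 ** k, so the result is
    2 ** (k * 2 ** squarings), which has k * 2 ** squarings + 1 bits.
    """
    k = valor.bit_length() - 1
    if valor != 1 << k:
        raise ValueError("this sketch only handles powers of two")
    return k * 2 ** squarings + 1


squarings = len(range(1, 100))       # the loop body runs 99 times
bits = bits_after_squarings(8, squarings)
four_gb_in_bits = 4 * 2 ** 30 * 8    # 4 GB expressed in bits, i.e. 2 ** 35
print(bits // four_gb_in_bits)       # 55340232221128654848, i.e. 3 * 2 ** 64
```

The printed value is the number of 4 GB blocks needed just to store the result for the input 8.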

Upvotes: 0
