John

Reputation: 475

Different results of using Thread and Process

I wrote very simple code:

from multiprocessing import Process
from threading import Thread

n = 0
def calculate_n(number):
    global n
    for i in range(number):
        n += 1
    print n

def print_n():
    global n
    print "n= "
    print n

and in main:

if __name__ == '__main__':
    number = 1000000
    t1 = Process(target=calculate_n, args=(number,))
    t1.start()
    t2 = Process(target=calculate_n, args=(number,))
    t2.start()
    print_n()

It gives the result:

n = 1000000

n = 1000000

As it should be. When I change the code in main to this:

number = 1000000
t1 = Thread(target=calculate_n, args=(number, ))
t1.start()
t2 = Thread(target=calculate_n, args=(number,))
t2.start()

I get different results every time:

n = 1388791

n = 1390167


n = 1426284

n = 1427452


n = 1295707

n = 1297116


and so on.

So the first case is rather simple. When we use Process, the code runs in separate processes, and each process uses its own copy of the global variable n, so I always get the expected result: 1000000 and 1000000.

When we run it in threads, they somehow share the global variable n, but what I cannot understand is why the result is different every time.

I hope I explained this clearly and that you can help.

Thank you in advance!

P.S.

Most importantly: why is it not 2 000 000?

The result should be 1 000 000 + 1 000 000 = 2 000 000

Upvotes: 1

Views: 395

Answers (1)

Mike Müller

Reputation: 85422

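Your threads update n simultaneously and don't necessarily see the update from the other thread. The statement n += 1 is not atomic: a thread reads n, adds 1, and writes the result back, and the other thread can run in between those steps. For example, both threads read n while its value is 1, both compute 2, and both write 2. Instead of 3, the value of n only increases to 2. This happens many times. Therefore, the final value of n is almost always less than 2000000.

You need to lock your global variable: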
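
To make the lost update concrete, here is a minimal sketch that replays one bad interleaving by hand in a single thread. It is only an illustration: the function name simulate_lost_update and the variables seen_by_a and seen_by_b are made up for this example, and it uses Python 3 print syntax, unlike the Python 2 code in the question.

```python
def simulate_lost_update(start=1):
    """Replay one unlucky interleaving of two `n += 1` operations.

    No real threads are used; the steps are written out in the order
    in which two threads might happen to execute them.
    """
    n = start
    seen_by_a = n        # thread A reads n
    seen_by_b = n        # thread B reads n before A has written back
    n = seen_by_a + 1    # thread A writes its result
    n = seen_by_b + 1    # thread B overwrites it with the same value
    return n


print(simulate_lost_update())  # prints 2, not 3: one increment is lost
```

Each time this interleaving happens in the real program, one increment disappears, which is why the totals vary from run to run.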

from threading import Thread, RLock

lock = RLock()

n = 0
def calculate_n(number):
    global n
    for i in range(number):
        with lock:
            n += 1
    print n

def print_n():
    global n
    print "n= "
    print n



if __name__ == '__main__':
    number = 1000000
    t1 = Thread(target=calculate_n, args=(number, ))
    t1.start()
    t2 = Thread(target=calculate_n, args=(number,))
    t2.start()
    t1.join()
    t2.join()
    print_n()

Output:

1991917
2000000
n= 
2000000

Acquiring and releasing the lock on every iteration will slow things down a lot. Locking the whole loop makes things much faster:

def calculate_n(number):
    global n
    with lock:
        for i in range(number):
            n += 1
    print n

Due to the GIL, threads won't speed up CPU-bound code anyway. So locking the whole loop costs you no parallelism and removes a lot of switching back and forth between threads.

Upvotes: 2
