Reputation: 11
In the code below, if I comment out 'line 6' and uncomment 'lines 3, 4, 5', the code runs in the main thread and takes 17 sec.
If I instead uncomment 'line 6', comment 'lines 3, 4, 5', uncomment 'line 1' and comment 'line 2', the code runs multithreaded (3 threads are created) and again takes 17 sec. So multithreading takes the same time as a single thread.
If I uncomment 'line 6', comment 'lines 3, 4, 5', comment 'line 1' and uncomment 'line 2', the code runs multiprocess (3 processes are created) and takes 9 sec. Multiprocessing is faster than both single-threaded and multithreaded execution.
Why is multithreaded performance the same as single-threaded?
import threading
import time
import multiprocessing
import multiprocessing.pool

class xyz:
    def __init__(self):
        self.data = []

    def loopData(self):
        item = range(0, 100000000)
        for i in item:
            self.data.append(i)

    def loopDatamulti(self):
        num_of_thread = 3
        thread_list = []
        while True:
            t = threading.Thread(target=self.loopData, args=())  #line 1
            #t = multiprocessing.Process(target=self.loopData, args=())  #line 2
            thread_list.append(t)
            num_of_thread_created = len(thread_list)
            if num_of_thread_created == num_of_thread:
                break
        for t in thread_list:
            t.start()
        for t in thread_list:
            t.join()

def main():
    print("Start")
    start_time = time.time()
    xa = xyz()
    #xa.loopData()  #line 3
    #xa.loopData()  #line 4
    #xa.loopData()  #line 5
    xa.loopDatamulti()  #line 6
    end_time = time.time()
    strLog = "total time {}".format(end_time - start_time)
    print(strLog)

if __name__ == "__main__":
    main()
Upvotes: 0
Views: 2806
Reputation:
Multi-threading is not well suited to the execution of functions that are purely CPU intensive. If such functions never yield the CPU (e.g. for some kind of I/O), they will just "lock down" a single CPU and you'll gain no benefit. This is where multi-processing comes into play. Even then, you need to be careful, because if your function is short-lived the overhead of creating a separate process may outweigh the advantages you might otherwise expect. Here's an example of multi-processing. Play with the variables ITERS and PROCS to see how the behaviour changes and you'll get the point. The function (myFunc) just carries out an arbitrary pseudo-random calculation and builds a list to return.
from datetime import datetime
from multiprocessing import Pool
import math
import random

ITERS = 100_000
PROCS = 100

def myFunc(r):
    return [(math.sqrt(random.randint(1, 2000))**2)**(1 / 3) for _ in range(r)]

if __name__ == '__main__':
    _start = datetime.now()
    with Pool() as pool:
        for p in [pool.apply_async(func=myFunc, args=(ITERS,))
                  for _ in range(PROCS)]:
            p.wait()
    _end = datetime.now()
    print(f'Multi-processing duration={_end-_start}')

    _start = datetime.now()
    for _ in range(PROCS):
        myFunc(ITERS)
    _end = datetime.now()
    print(f'Single-threaded duration={_end-_start}')
On my machine and with the values of ITERS and PROCS as shown in the code, the output is as follows:-
Multi-processing duration=0:00:01.526478
Single-threaded duration=0:00:09.776963
Upvotes: 1
Reputation: 81
In Python there is something called the Global Interpreter Lock (GIL), which ensures that only one thread executes Python bytecode at a time (i.e. one thread holds the lock on the interpreter and the other threads have to wait for it). The reason for this has to do with how Python frees unused objects (by counting the number of references to each object).
You can read up on it here: https://realpython.com/python-gil/
Multiprocessing sidesteps this issue by creating OS-level processes in order to actually parallelize your code. Each process gets its own Python interpreter and therefore its own Global Interpreter Lock.
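You can see the effect directly by running the same CPU-bound function through a thread pool and a process pool. This is a minimal sketch; the function name `cpu_work` and the constants are just illustrative:

```python
import time
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool

N_TASKS = 4
LOOPS = 1_000_000

def cpu_work(n):
    # Pure-Python arithmetic: the thread running this holds the GIL
    # the whole time, so threads cannot run it in parallel.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    start = time.time()
    with ThreadPool(N_TASKS) as tp:
        tp.map(cpu_work, [LOOPS] * N_TASKS)
    print(f"threads:   {time.time() - start:.2f} s")  # roughly serial due to the GIL

    start = time.time()
    with Pool(N_TASKS) as pp:
        pp.map(cpu_work, [LOOPS] * N_TASKS)
    print(f"processes: {time.time() - start:.2f} s")  # typically faster on a multicore machine
```

On a multicore machine the process-pool timing is usually well below the thread-pool timing, because each worker process has its own interpreter and GIL.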
Upvotes: 0