Reputation: 166
Recently,I tried to use asyncio to execute multiple blocking operations asynchronously.I used the function loop.run_in_executor
,It seems that the function puts tasks into the thread pool.As far as I know about thread pool,it reduces the overhead of creating and destroying threads,because it can put in a new task when a task is finished instead of destroying the thread.I wrote the following code for deeper unstanding.
def blocking_funa():
print('starta')
print('starta')
time.sleep(4)
print('enda')
def blocking_funb():
print('startb')
print('startb')
time.sleep(4)
print('endb')
loop = asyncio.get_event_loop()
tasks = [loop.run_in_executor(None, blocking_funa), loop.run_in_executor(None, blocking_funb)]
loop.run_until_complete(asyncio.wait(tasks))
and the output:
starta
startbstarta
startb
(wait for about 4s)
enda
endb
we can see these two tasks are almost simultaneous.now I use threading module:
threads = [threading.Thread(target = blocking_ioa), threading.Thread(target = blocking_iob)]
for thread in threads:
thread.start()
thread.join()
and the output:
starta
starta
enda
startb
startb
endb
Due to the GIL limitation, only one thread is executing at the same time,so I understand the output.But how does thread pool executor make these two tasks almost simultaneous.What is the different between thread pool and thread?And Why does thread pool look like it's not limited by GIL?
Upvotes: 1
Views: 607
Reputation: 31319
You're not making a fair comparison, since you're joining the first thread before starting the second.
Instead, consider:
import time
import threading
def blocking_funa():
print('a 1')
time.sleep(1)
print('a 2')
time.sleep(1)
print('enda (quick)')
def blocking_funb():
print('b 1')
time.sleep(1)
print('b 2')
time.sleep(4)
print('endb (a few seconds after enda)')
threads = [threading.Thread(target=blocking_funa), threading.Thread(target=blocking_funb)]
for thread in threads:
thread.start()
for thread in threads:
thread.join()
The output:
a 1
b 1
b 2
a 2
enda (quick)
endb (a few seconds after enda)
Considering it hardly takes any time to run a print statement, you shouldn't read too much into the prints in the first example getting mixed up.
If you run the code repeatedly, you may find that b 2
and a 2
will change order more or less randomly. Note how in my posted result, b 2
occurred before a 2
.
Also, regarding your remark "Due to the GIL limitation, only one thread is executing at the same time" - you're right that the "execution of any Python bytecode requires acquiring the interpreter lock. This prevents deadlocks (as there is only one lock) and doesn’t introduce much performance overhead. But it effectively makes any CPU-bound Python program single-threaded." https://realpython.com/python-gil/#the-impact-on-multi-threaded-python-programs
The important part there is "CPU-bound" - of course you would still benefit from making I/O-bound code multi-threaded.
Upvotes: 1
Reputation: 77337
Python does a release/acquire on the GIL often. This means that runnable GIL controlled threads will all get little sprints. Its not parallel, just interleaved. More important for your example, python tends to release the GIL when doing a blocking operation. The GIL is released before sleep
and also when print
enters the C libraries.
Upvotes: 0