Reputation: 135
I am trying to write faster python code using multiple threads. I don't wanna use ProcessPoolExecutor
for memory consumption preserving.
So, When I use ThreadPoolExecutor.map
which maps an iterator to a function. It's really not working the time taken by the code is not really enhanced. Then I read about GIL in python, my question is: If GIL is applied to all, why they create ThreadPoolExecutor.map
, or is there any better idea to use ThreadPoolExecutor.map
.
Please find an example that I use to understand this process. I am happy to hear suggestions in how to use multiple threads for iterator and not multiple process, since my memory card is and cpus are not that high.
import concurrent.futures
import math
PRIMES = [
112272535095293,
112582705942171,
112272535095293,
115280095190773,
115797848077099,
1099726899285419]
def is_prime(n):
if n < 2:
return False
if n == 2:
return True
if n % 2 == 0:
return False
sqrt_n = int(math.floor(math.sqrt(n)))
for i in range(3, sqrt_n + 1, 2):
if n % i == 0:
return False
return True
def main():
with concurrent.futures.ThreadPoolExecutor(max_workers = 4) as executor:
future= executor.map(is_prime, PRIMES)
print(future.result())
if __name__ == '__main__':
main()
Upvotes: 1
Views: 6353
Reputation: 9953
As you noted, the GIL
means that only 1 CPU can be used at a time and there's only 1 "real" thread. So using a thread pool will not speed up CPU-bound code. However, many of the functions in the standard library and other 3rd party libraries will voluntarily release the GIL
if they get blocked on I/O.
For example, if you have code trying to download lots of things from the internet you could you a Python thread pool. When one thread makes a request to download something and is waiting for the response it can (depending on what library you use to make your HTTP requests) release the GIL
and give another thread a chance to run. That means you could make many, many requests before any of them complete and then each Python thread will be woken up again when their response is available. That kind of Python multi-threading can result in a huge performance improvement.
Upvotes: 4