Reputation: 385
Question:
I am trying to gain a better understanding of Python's multiprocessing and multithreading, particularly in the context of using the concurrent.futures
module. I want to make sure my understanding is correct.
Multiprocessing:
I believe that multiprocessing creates multiple processes, each with its own Python interpreter, Global Interpreter Lock (GIL), and memory space, so these processes can make efficient use of multiple CPU cores. Each process starts with its own main thread and can create additional threads independently of the other processes.
Here is an example of using a ProcessPoolExecutor:
import concurrent.futures

max_processes = 4

# Create a ProcessPoolExecutor with a maximum of 4 concurrent processes
with concurrent.futures.ProcessPoolExecutor(max_workers=max_processes) as executor:
    # Use the executor to map your function to the list of numbers
    results = list(executor.map(calculate_square, numbers))
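For reference, a self-contained version of this snippet might look like the sketch below. The body of calculate_square, the contents of numbers, and the PID reporting are assumptions added for illustration; they only serve to show that the work really runs in separate processes.

```python
import concurrent.futures
import os


def calculate_square(n):
    """Return n squared together with the PID of the process that computed it."""
    return n * n, os.getpid()


def main():
    numbers = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical input data
    with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(calculate_square, numbers))

    squares = [square for square, _ in results]
    worker_pids = {pid for _, pid in results}
    print("squares:", squares)
    print("parent PID:", os.getpid())
    print("worker PIDs:", worker_pids)  # separate processes, so not the parent PID


# The guard is needed because worker processes may re-import this module.
if __name__ == "__main__":
    main()
```

The function passed to the executor has to be defined at module level so that it can be pickled and sent to the workers.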
Multithreading:
In contrast, multithreading creates multiple threads within a single process. These threads share the same memory space and the same GIL. Multithreading is typically more suitable for I/O-bound tasks than for CPU-bound tasks. However, as I understand it, it doesn't utilize multiple CPU cores: even on a machine with 4 cores, a program uses only one core, no matter how many threads it runs.
Here is an example of using a ThreadPoolExecutor:
import concurrent.futures

# Create a ThreadPoolExecutor with a specified maximum number of threads
with concurrent.futures.ThreadPoolExecutor(max_workers=max_threads) as executor:
    # Use the executor to map your function to the list of numbers
    results = list(executor.map(calculate_square, numbers))
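To illustrate the "shared memory space" part, here is a minimal sketch in which every worker thread writes into the same dictionary. The results dictionary, the lock, and the sample numbers are assumptions made up for this example.

```python
import concurrent.futures
import threading

results = {}  # one dictionary, visible to every thread in this process
results_lock = threading.Lock()


def calculate_square(n):
    """Square n and record which thread did the work in the shared dictionary."""
    square = n * n
    with results_lock:  # serialize updates to the shared state
        results[n] = (square, threading.current_thread().name)
    return square


def main():
    numbers = [1, 2, 3, 4, 5]  # hypothetical input data
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        squares = list(executor.map(calculate_square, numbers))
    print("squares:", squares)
    print("shared results:", results)


if __name__ == "__main__":
    main()
```

With a process pool, the same pattern would not work as written, because each worker process would update its own copy of the dictionary.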
Hybrid:
It’s possible to combine the concurrent.futures thread and process pools to handle both I/O-intensive and CPU-intensive tasks by creating multiple processes, each with its own threads.
Upvotes: 0
Views: 869
Reputation: 3490
For I/O-bound programs, multithreading alone is usually sufficient, and it can still make use of multiple cores, because the GIL is released around potentially blocking I/O operations such as reading or writing a file.
If the workload is a mix of CPU-bound and I/O-bound tasks, you might consider combining multiple processes with threading.
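A rough way to see this is to time a few blocking calls run in a thread pool. In the sketch below, fake_io_task and run_in_threads are hypothetical names, and time.sleep stands in for a real blocking I/O call (it also releases the GIL while waiting).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(delay):
    """Stand-in for a blocking I/O call; the GIL is released during the sleep."""
    time.sleep(delay)
    return delay


def run_in_threads(delays):
    """Run one fake I/O task per thread and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        results = list(executor.map(fake_io_task, delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_in_threads([0.5, 0.5, 0.5, 0.5])
    # Run sequentially these waits would add up to about 2 seconds;
    # in threads they overlap, so the total should be much closer to 0.5.
    print(results, f"{elapsed:.2f}s")
```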
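from concurrent.futures import ThreadPoolExecutor
from multiprocessing import Pool

def run_child():
    """Run in a child process."""
    # create a thread pool to run tasks
    executor = ThreadPoolExecutor()
    # ...

def main():
    pool = Pool()  # multiprocessing
    pool.apply_async(run_child)
    pool.close()
    pool.join()  # wait for the child process to finish

if __name__ == "__main__":
    main()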
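A fuller sketch of the same idea could look like this. It uses ProcessPoolExecutor instead of multiprocessing.Pool so that both pools come from concurrent.futures; fetch, crunch, process_chunk, and the sample chunks are hypothetical placeholders for your real I/O and CPU work.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def fetch(item):
    """Hypothetical I/O-bound step, simulated with a short sleep."""
    time.sleep(0.1)
    return item


def crunch(item):
    """Hypothetical CPU-bound step: sum of squares below item."""
    return sum(i * i for i in range(item))


def process_chunk(chunk):
    """Runs in a child process: fetch with a thread pool, then compute."""
    with ThreadPoolExecutor(max_workers=4) as threads:
        fetched = list(threads.map(fetch, chunk))
    return [crunch(item) for item in fetched]


def main():
    chunks = [[10, 20], [30, 40], [50, 60]]  # hypothetical work, one chunk per process
    with ProcessPoolExecutor(max_workers=3) as processes:
        results = list(processes.map(process_chunk, chunks))
    print(results)


if __name__ == "__main__":
    main()
```

Each child process gets its own GIL for the CPU-bound part, while the threads inside it overlap the waiting time of the I/O-bound part.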
Furthermore, for I/O-bound tasks, coroutines might be a better option. You could also consider combining multiple processes with coroutines.
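As a minimal sketch of the process-plus-coroutine combination, assuming asyncio is the coroutine framework: the event loop handles the I/O-bound coroutines and hands CPU-bound work to a process pool through loop.run_in_executor. The names cpu_task and io_task and the sample arguments are made up for this example.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor


def cpu_task(n):
    """Hypothetical CPU-bound work, executed in a worker process."""
    return sum(i * i for i in range(n))


async def io_task(delay):
    """Hypothetical I/O-bound work, simulated with a non-blocking sleep."""
    await asyncio.sleep(delay)
    return delay


async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor(max_workers=2) as pool:
        # Start the CPU-bound job in another process without blocking the loop.
        cpu_future = loop.run_in_executor(pool, cpu_task, 100_000)
        # Meanwhile, run the I/O-bound coroutines concurrently on the event loop.
        io_results = await asyncio.gather(io_task(0.1), io_task(0.2))
        cpu_result = await cpu_future
    print(io_results, cpu_result)


if __name__ == "__main__":
    asyncio.run(main())
```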
Upvotes: 1