uba2012

Reputation: 385

Python - multiprocessing and multithreading

Question:

I am trying to gain a better understanding of Python's multiprocessing and multithreading, particularly in the context of using the concurrent.futures module. I want to make sure my understanding is correct.

Multiprocessing:

I believe that multiprocessing creates multiple processes, each with its own Global Interpreter Lock (GIL) and memory space, so they can run in parallel and efficiently utilize multiple CPU cores. Each process is scheduled independently by the operating system.

Here is an example of using a ProcessPoolExecutor:

import concurrent.futures

# Create a ProcessPoolExecutor with a maximum of 4 worker processes
with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:
    # Use the executor to map calculate_square over the list of numbers
    results = list(executor.map(calculate_square, numbers))

Multithreading:

In contrast, multithreading creates multiple threads within a single process. These threads share the same memory space and GIL. Multithreading is typically more suitable for I/O-bound tasks than for CPU-bound tasks, because the GIL lets only one thread execute Python bytecode at a time: even on a 4-core machine, a CPU-bound program effectively uses one core no matter how many threads it starts.

Here is an example of using a ThreadPoolExecutor:

import concurrent.futures

# Create a ThreadPoolExecutor with a specified maximum number of threads
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    # Use the executor to map calculate_square over the list of numbers
    results = list(executor.map(calculate_square, numbers))

Hybrid:

It’s possible to combine the concurrent.futures thread and process pools to handle both I/O- and CPU-intensive tasks, by creating multiple processes, each running its own thread pool.

Upvotes: 0

Views: 869

Answers (1)

Jacky Wang

Reputation: 3490

For I/O-bound programs, multithreading alone is usually sufficient: the GIL is released around potentially blocking I/O operations such as reading or writing a file, so other threads can run while one waits.
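For example, in this minimal sketch (with time.sleep standing in for blocking I/O such as a network call), four simulated 0.2-second reads finish in roughly the time of one, because the threads wait concurrently:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(n):
    # Simulated blocking I/O; the GIL is released during the sleep
    time.sleep(0.2)
    return n

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(fetch, range(4)))
elapsed = time.perf_counter() - start
# The four 0.2 s waits overlap, so the total is close to 0.2 s, not 0.8 s
```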

If the workload is a hybrid of CPU-bound and I/O-bound tasks, you might consider combining multiprocessing with threading.

from concurrent.futures import ThreadPoolExecutor
from multiprocessing import Pool

def run_child():
    """Runs in a child process."""
    # Create a thread pool inside this process to run I/O tasks
    with ThreadPoolExecutor() as executor:
        # ... submit tasks to executor ...
        pass

def main():
    pool = Pool()  # multiprocessing: one worker per CPU core by default
    pool.apply_async(run_child)
    pool.close()
    pool.join()

Furthermore, for I/O-bound tasks, coroutines (e.g. asyncio) might be a better option; you could also consider combining multiprocessing with coroutines.
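A minimal coroutine sketch, using asyncio.sleep to stand in for non-blocking I/O: the event loop interleaves all the waits on a single thread.

```python
import asyncio

async def fetch(n):
    # Simulated non-blocking I/O; the event loop runs other
    # coroutines while this one is suspended
    await asyncio.sleep(0.1)
    return n * n

async def main():
    # Run all four "requests" concurrently on one thread
    return await asyncio.gather(*(fetch(n) for n in range(4)))

results = asyncio.run(main())
print(results)  # [0, 1, 4, 9]
```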

Upvotes: 1
