CutePoison

Reputation: 5355

Combine multiprocess and multithread in Python

(I don't know if this should be asked here on SO, or on one of the other Stack Exchange sites.)

When doing heavy I/O-bound tasks, e.g. API calls or database fetches, I wonder: since Python only uses one process for multithreading, can we create even more threads by combining multiprocessing and multithreading, like the pseudo-code below,

for process in Processes:
    for thread in threads:
        fetch_api_results(thread)

or does Python do this automatically?

Upvotes: 0

Views: 371

Answers (1)

2e0byo

Reputation: 5954

I do not think there would be any point doing this: spinning up a new process has a relatively high cost, and spinning up a new thread has a pretty high cost. Serialising tasks to those threads or processes costs again, and synchronising state costs...again.

What I would do if I had two sets of problems:

  • I/O bound problems (e.g. fetching data over a network)
  • CPU bound problems related to those I/O bound problems

is to combine multiprocessing with asyncio. This has a much lower overhead: within each process we have only one thread and pay only for the scheduler (no serialisation between tasks), and it doesn't involve spinning up a gazillion processes (each of which uses around as much virtual memory as the parent process) or threads (each of which still uses a fair chunk of memory).

However, I would not use asyncio within the worker processes---I'd use asyncio in the main process, and offload CPU-intensive tasks to a pool of worker processes when needed.
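A minimal sketch of that layout, using the standard-library `asyncio` and `concurrent.futures.ProcessPoolExecutor` (the `fetch` and `crunch` functions are hypothetical stand-ins, not anything from your code):

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def crunch(n):
    # Stand-in for a CPU-bound task; runs in a worker process.
    return sum(i * i for i in range(n))

async def fetch(i):
    # Stand-in for an I/O-bound task (e.g. an API call);
    # runs on the event loop in the main process.
    await asyncio.sleep(0.01)
    return i

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # I/O-bound work stays on the event loop...
        results = await asyncio.gather(*(fetch(i) for i in range(10)))
        # ...while CPU-bound work is shipped off to worker processes.
        return await asyncio.gather(
            *(loop.run_in_executor(pool, crunch, r * 1000) for r in results)
        )

if __name__ == "__main__":
    print(asyncio.run(main()))
```

The event loop handles all the concurrent I/O in one thread, and `run_in_executor` is the only point where work (and its serialisation cost) crosses a process boundary.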

I suspect you probably can use threading inside multiprocessing, but it is very unlikely to bring you any speed boost.
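For completeness, threads inside processes would look something like the sketch below (`fetch_api_result` is a hypothetical placeholder for a blocking I/O call); it runs, but per the above I wouldn't expect it to beat the asyncio approach:

```python
import multiprocessing as mp
from concurrent.futures import ThreadPoolExecutor

def fetch_api_result(url):
    # Placeholder for a blocking I/O call (a real version might use urllib).
    return f"fetched {url}"

def worker(urls):
    # Each process runs its own thread pool for its share of the I/O calls.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(fetch_api_result, urls))

if __name__ == "__main__":
    chunks = [["a", "b"], ["c", "d"]]
    with mp.Pool(2) as procs:
        # One chunk of URLs per process, threads fan out within each process.
        print(procs.map(worker, chunks))
```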

Upvotes: 1
