andytaylor823

Reputation: 51

Showing tqdm progress bar while using Python multiprocessing

I am trying to run a computationally heavy task using Python's multiprocessing library, and I would like to show a tqdm progress bar for each worker. Specifically, I would like this to work with multiprocessing.Process workers or multiprocessing.Pool workers.

I am aware of the similar StackOverflow questions on this topic (see e.g. (1) Multiprocessing : use tqdm to display a progress bar, (2) Show the progress of a Python multiprocessing pool imap_unordered call?, (3) tqdm progress bar and multiprocessing), but they all seem interested in showing one progress bar across all workers. I would like to show a progress bar for each worker.

Here is an example function, standing in for the computationally expensive function I would like to run in parallel:

from tqdm import notebook
import time

def foo2(worker_id):
    # worker_id doubles as the bar's position (and avoids shadowing the built-in id)
    total = 100
    with notebook.tqdm(total=total, position=worker_id) as pbar:
        for _ in range(0, total, 5):
            pbar.update(5)
            time.sleep(0.1)

When I try this sequentially, I get the expected results: 5 progress bars filling up one after the other.

However, when I try to do this with multiprocessing, I get the desired speed-up, but no progress bars are displayed. This is true whether I use Pool workers or Process workers. Here is my sample code:

%%time
from multiprocessing import Pool
pool = Pool(5)
pool.map(foo2, range(5))
pool.close()
pool.join()

Pool - no progress bars

Per the comments here (https://github.com/tqdm/tqdm/issues/407#issuecomment-322932800), I tried using several ThreadPool workers, and this strangely was able to produce the progress bars. However, for my situation, I would prefer to use Pool or Process workers with progress bars.

%%time
from multiprocessing.pool import ThreadPool
pool = ThreadPool(5)
pool.map(foo2, range(5))
pool.close()
pool.join()

ThreadPool - progress bars show!
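
One thing that may be relevant here: ThreadPool workers are threads inside the notebook's own process, while Pool workers are separate processes. The sketch below only demonstrates that fact by comparing process IDs. Whether this is the actual reason the notebook bars fail to render is an assumption, not something confirmed in the linked issue. The helper names (worker_pid, thread_pids, process_pids) are made up for illustration.

```python
import os
from multiprocessing import get_all_start_methods, get_context
from multiprocessing.pool import ThreadPool


def worker_pid(_):
    """Return the ID of the process this call runs in."""
    return os.getpid()


def thread_pids(n):
    """Collect the process IDs seen by n ThreadPool workers."""
    with ThreadPool(n) as pool:
        return set(pool.map(worker_pid, range(n)))


def process_pids(n):
    """Collect the process IDs seen by n Pool workers."""
    # Assumption: prefer "fork" where it exists, since functions defined
    # interactively (e.g. in a notebook cell) cannot be re-imported by
    # spawned workers. None falls back to the platform default.
    method = "fork" if "fork" in get_all_start_methods() else None
    with get_context(method).Pool(n) as pool:
        return set(pool.map(worker_pid, range(n)))


if __name__ == "__main__":
    print("parent process:    ", os.getpid())
    print("ThreadPool workers:", thread_pids(5))   # same ID as the parent
    print("Pool workers:      ", process_pids(5))  # IDs of child processes
```

Every ThreadPool worker reports the parent's process ID, whereas the Pool workers report the IDs of child processes.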

Hopefully someone can help me with this. I have tried just about everything I could think of. For reference, I am using Python 3.7.7 and tqdm 4.57.0.

Upvotes: 1

Views: 9708

Answers (2)

padu

Reputation: 889

I don't know if the question is still relevant, but you can easily do this with the parallelbar library. Here is a simple example:

#pip install parallelbar
from parallelbar import progress_map
from math import radians, sin, cos

# toy example of a computationally expensive function
def cpu_bench(number):
    product = 1.0
    for elem in range(number):
        angle = radians(elem)
        product *= sin(angle)**2 + cos(angle)**2
    return product

if __name__ == '__main__':
    tasks = [1000000 + i for i in range(100)]
    result = progress_map(cpu_bench, tasks, n_cpu=4, chunk_size=1, core_progress=True)

Upvotes: 2

andytaylor823

Reputation: 51

Searching the issues on the main tqdm GitHub page, I found a hack that works for me, although it definitely feels like a "hack" rather than a real fix: https://github.com/tqdm/tqdm/issues/485#issuecomment-473338308

The new (working) code looks like:

from tqdm import notebook
import time

def foo2(id):
    total = 100
    # the hack from the linked comment: print something (and flush) before creating the bar
    print(' ', end='', flush=True)
    for _ in notebook.tqdm(range(0, total, 5)):
        time.sleep(0.1)
plus

%%time
from multiprocessing import Pool
pool = Pool(5)
#pool.map(foo2, range(5)) # this also works fine with the new hack
for i in range(5):
    pool.apply_async(foo2, args=(i,))
pool.close()
pool.join()
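
An alternative that avoids drawing anything inside the workers is to have each worker report its progress to the parent over a queue and let the parent update one bar per worker. The following is a minimal sketch under that assumption, not the approach from the linked issue. The names worker, collect_progress and on_update are hypothetical, and the callback is the place where a per-worker bar created in the parent could be updated.

```python
import multiprocessing as mp
import time


def worker(worker_id, total, queue):
    """Stand-in for the expensive task: report progress instead of drawing a bar."""
    for _ in range(0, total, 5):
        time.sleep(0.01)
        queue.put((worker_id, 5))
    queue.put((worker_id, None))  # sentinel: this worker has finished


def collect_progress(queue, n_workers, on_update):
    """Read updates in the parent until every worker has sent its sentinel."""
    finished = 0
    totals = {}
    while finished < n_workers:
        worker_id, step = queue.get()
        if step is None:
            finished += 1
            continue
        totals[worker_id] = totals.get(worker_id, 0) + step
        on_update(worker_id, step)  # e.g. update that worker's bar here
    return totals


if __name__ == "__main__":
    n_workers, total = 5, 100
    # Assumption: prefer "fork" where it exists so that functions defined
    # interactively can reach the workers; None means the platform default.
    method = "fork" if "fork" in mp.get_all_start_methods() else None
    ctx = mp.get_context(method)
    queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(i, total, queue))
             for i in range(n_workers)]
    for p in procs:
        p.start()
    totals = collect_progress(queue, n_workers, lambda wid, step: None)
    for p in procs:
        p.join()
    print(totals)
```

In a notebook, on_update could be something like `lambda wid, step: bars[wid].update(step)`, where bars is a list of notebook.tqdm(total=100) objects created in the parent before the workers start.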

Upvotes: 1
