Reputation: 51
I am trying to run some computationally heavy task using Python's multiprocessing library, and I would like to show a tqdm progress bar for each worker. Specifically, I would prefer to have this functionality for multiprocessing.Process workers or multiprocessing.Pool workers.
I am aware of the similar StackOverflow questions on this topic (see e.g. (1) Multiprocessing : use tqdm to display a progress bar, (2) Show the progress of a Python multiprocessing pool imap_unordered call?, (3) tqdm progress bar and multiprocessing), but they all seem interested in showing one progress bar across all workers. I would like to show a progress bar for each worker.
Here is an example function, standing in for the computationally expensive function I would like to multiprocess:
from tqdm import notebook
import time
def foo2(id):
    total = 100
    with notebook.tqdm(total=total, position=id) as pbar:
        for _ in range(0, total, 5):
            pbar.update(5)
            time.sleep(0.1)
When I try this sequentially, I get the expected results: 5 progress bars filling up one after the other.
However, when I try to do this with multiprocessing, I get the desired speed-up, but no progress bars are displayed. This is true whether I use Pool workers or Process workers. Here is my sample code:
%%time
from multiprocessing import Pool
pool = Pool(5)
pool.map(foo2, range(5))
pool.close()
pool.join()
Per the comments here (https://github.com/tqdm/tqdm/issues/407#issuecomment-322932800), I tried using several ThreadPool workers, and strangely this did produce the progress bars. However, for my situation, I would prefer to use Pool or Process workers with progress bars.
%%time
from multiprocessing.pool import ThreadPool
pool = ThreadPool(5)
pool.map(foo2, range(5))
pool.close()
pool.join()
ThreadPool - progress bars show!
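One possible explanation for the difference, offered as an assumption rather than something confirmed in the linked tqdm issue: ThreadPool workers are threads inside the notebook kernel's own process, while Pool workers are separate processes. The sketch below only demonstrates that process boundary by comparing PIDs; `worker_pid`, `thread_pids` and `process_pids` are illustrative names, not part of tqdm or of the code above.

```python
import os
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool


def worker_pid(_):
    """Return the PID of whichever process executes this task."""
    return os.getpid()


def thread_pids(n=5):
    """PIDs seen by ThreadPool workers (threads of the calling process)."""
    with ThreadPool(n) as pool:
        return set(pool.map(worker_pid, range(n)))


def process_pids(n=5):
    """PIDs seen by Pool workers (separate child processes)."""
    with Pool(n) as pool:
        return set(pool.map(worker_pid, range(n)))


if __name__ == "__main__":
    print("parent    :", os.getpid())
    print("ThreadPool:", thread_pids())   # contains only the parent's PID
    print("Pool      :", process_pids())  # PIDs of the child processes
```

If the notebook widget can only be driven from the kernel process, that would be consistent with bars showing up for ThreadPool but not for Pool.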
Hopefully someone can help me with this. I have tried just about everything I could think of. For reference, I am using Python 3.7.7 and tqdm 4.57.0.
Upvotes: 1
Views: 9708
Reputation: 889
I don't know if the question is still relevant, but you can easily do it using the parallelbar library. Here is a simple example:
# pip install parallelbar
from parallelbar import progress_map
from math import radians, sin, cos
# toy example of a computationally expensive function
def cpu_bench(number):
    product = 1.0
    for elem in range(number):
        angle = radians(elem)
        product *= sin(angle)**2 + cos(angle)**2
    return product
if __name__=='__main__':
    tasks = [1000000 + i for i in range(100)]
    result = progress_map(cpu_bench, tasks, n_cpu=4, chunk_size=1, core_progress=True)
Upvotes: 2
Reputation: 51
Searching the issues on the main GitHub page for tqdm, I found a hack that works for me, but it definitely feels like a "hack" rather than a true fix: https://github.com/tqdm/tqdm/issues/485#issuecomment-473338308
The new (working) code looks like:
from tqdm import notebook
import time
def foo2(id):
    total = 100
    # printing (and flushing) one character first is the hack from the linked issue
    print(' ', end='', flush=True)
    for _ in notebook.tqdm(range(0, total, 5)):
        time.sleep(0.1)
plus
%%time
from multiprocessing import Pool
pool = Pool(5)
#pool.map(foo2, range(5)) # this also works fine with the new hack
for i in range(5):
    pool.apply_async(foo2, args=(i,))
pool.close()
pool.join()
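If the print hack feels too fragile, one alternative worth trying is to keep every bar in the parent process and let the workers only report progress through a queue. This is a minimal sketch under my own assumptions, not something taken from the linked issue: `work`, `drain` and `run` are hypothetical names, `work` stands in for foo2, and `tqdm.auto` is used so the same code picks the notebook widget or the terminal bar.

```python
import time
from multiprocessing import Manager, Pool


def work(worker_id, queue, total=100, step=5):
    """Stand-in for the expensive function: report progress instead of drawing it."""
    try:
        for _ in range(0, total, step):
            time.sleep(0.1)
            queue.put((worker_id, step))   # progress increment
    finally:
        queue.put((worker_id, None))       # sentinel: this worker has finished


def drain(queue, bars):
    """In the parent: update one bar per worker until every sentinel has arrived."""
    remaining = len(bars)
    while remaining:
        worker_id, step = queue.get()
        if step is None:
            remaining -= 1
        else:
            bars[worker_id].update(step)


def run(n_workers=5, total=100):
    # Imported here so the helpers above stay usable without tqdm installed.
    from tqdm.auto import tqdm

    with Manager() as manager, Pool(n_workers) as pool:
        queue = manager.Queue()            # proxy queue that can be passed to Pool workers
        bars = [tqdm(total=total, position=i, desc=f"worker {i}")
                for i in range(n_workers)]
        tasks = [(i, queue, total) for i in range(n_workers)]
        result = pool.starmap_async(work, tasks)
        drain(queue, bars)                 # blocks until all workers are done
        result.get()                       # re-raises any exception from a worker
        for bar in bars:
            bar.close()
```

Calling `run()` from a notebook cell should then draw five bars that fill in parallel. Since the workers never touch tqdm, the bars do not depend on what a child process is able to display. On platforms that use the spawn start method, `work` would need to live in an importable module rather than in a notebook cell.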
Upvotes: 1