jonka

Reputation: 35

Show timeit progress

I have several functions whose execution time I repeatedly want to measure using the built-in timeit library. (Say fun1, fun2 and fun3 all depend on a couple of subroutines, some of which I am trying to optimize. After every iteration, I want to know how fast my three top-level functions run.)

The thing is, I do not know in advance how long the functions will run; I only have a rough estimate. Using timeit.repeat(...) with a sufficient number of repetitions and executions gives me a good estimate, but sometimes it takes very long because I accidentally slowed down one of the subroutines. It would be very handy to have a tqdm-like progress bar for the timing routine so I can estimate how long I have to wait until timing is done. I did not find any such feature in the timeit library, so here is the question:

Is it possible to show a (tqdm-like) progress bar when timing functions using timeit.repeat or timeit.timeit?

Upvotes: 2

Views: 1388

Answers (2)

James

Reputation: 36746

You can create your own subclass of timeit.Timer that uses tqdm to track the total iterations performed.

from timeit import Timer, default_number
from tqdm import tqdm
import itertools
import gc

class ProgressTimer(Timer):
    def timeit(self, number=default_number):
        """Time 'number' executions of the main statement.
        To be precise, this executes the setup statement once, and
        then returns the time it takes to execute the main statement
        a number of times, as a float measured in seconds.  The
        argument is the number of times through the loop, defaulting
        to one million.  The main statement, the setup statement and
        the timer function to be used are passed to the constructor.
        """
        # wrap the iterator in tqdm
        it = tqdm(itertools.repeat(None, number), total=number)
        gcold = gc.isenabled()
        gc.disable()
        try:
            timing = self.inner(it, self.timer)
        finally:
            if gcold:
                gc.enable()
        # the tqdm bar sometimes doesn't flush on short timers, so print an empty line
        print()
        return timing

To use this object, we just need to pass in the script we want to run. You can either define it as a string (like below) or open a file and read its contents into a variable.

py_setup = 'import numpy as np'

py_script = """
x = np.random.rand(1000)
x.sum()
"""

pt = ProgressTimer(py_script, setup=py_setup)
pt.timeit()

# prints / returns:
100%|███████████████████████████████████████████████| 1000000/1000000 [00:13<00:00, 76749.68it/s]
13.02982600001269
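
Since the question also mentions timeit.repeat: in CPython's implementation, Timer.repeat calls self.timeit once per repetition, so overriding timeit alone should be enough for a subclass to report progress during repeat as well. The sketch below illustrates that dispatch with the standard library only. CountingTimer and its runs attribute are hypothetical names used for illustration, not part of timeit, and the print call stands in for whatever progress display you prefer.

```python
import timeit


class CountingTimer(timeit.Timer):
    """Hypothetical helper: announces each run that repeat() triggers."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.runs = 0  # number of times timeit() has been entered

    def timeit(self, number=timeit.default_number):
        # Timer.repeat() calls self.timeit(), so this override is reached
        # once per repetition (CPython behavior, assumed here).
        self.runs += 1
        print(f"starting run {self.runs} ({number} loops)")
        return super().timeit(number)


ct = CountingTimer("sum(range(100))")
results = ct.repeat(repeat=3, number=1000)
print(min(results))
```

The same reasoning suggests that ProgressTimer above would show one bar per repetition when called as pt.repeat(repeat=3, number=1000).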

Upvotes: 3

Wups

Reputation: 2569

Looking at the source code of timeit, there is a template that gets executed when any timing is done. One could simply change that template to include a progress indicator:

import timeit

timeit.template = """
def inner(_it, _timer{init}):
    from tqdm import tqdm
    {setup}
    _t0 = _timer()
    for _i in tqdm(_it, total=_it.__length_hint__()):
        {stmt}
    _t1 = _timer()
    return _t1 - _t0
"""

# some timeit test:
timeit.timeit(lambda: "-".join(map(str, range(100))), number=1000000)

Of course, this will influence the result, because the tqdm calls happen between the _t0 and _t1 measurements. tqdm's documentation claims that the overhead is only about 60 ns per iteration, though.
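
If any in-loop overhead is unwanted, one alternative is to estimate the total duration before starting the long run. A minimal sketch, assuming Python 3.6+ where Timer.autorange accepts a callback: autorange increases the loop count until a trial takes at least 0.2 seconds, and the per-loop time from that trial can be scaled to the planned run. estimate_duration, planned_number and planned_repeat are hypothetical names, and the result is only a rough extrapolation.

```python
import timeit


def estimate_duration(stmt, setup="pass", planned_number=1_000_000, planned_repeat=5):
    """Roughly estimate how long repeat(planned_repeat, planned_number) would take."""
    timer = timeit.Timer(stmt, setup=setup)
    trials = []  # (loops, seconds) for every trial autorange performs

    # autorange() calls the callback after each trial with (number, time_taken).
    loops, elapsed = timer.autorange(callback=lambda n, t: trials.append((n, t)))

    per_loop = elapsed / loops
    return per_loop * planned_number * planned_repeat, trials


estimate, trials = estimate_duration(
    "'-'.join(map(str, range(100)))", planned_number=100_000, planned_repeat=3
)
print(f"expected total: about {estimate:.1f} s after {len(trials)} calibration trials")
```

Nothing is added to the timed loop here, at the cost of a short calibration phase before the real measurement.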

Upvotes: 4
