Reputation: 18
I have a large dataset I need to run a few short operations over repeatedly in Python. As the dataset grows, the runtime of the entire process grows quadratically. Obviously, there are ways to batch these operations so that they run faster, but I want to measure which operations are taking the most compute time before I try to optimize them.
However, when I try to measure the runtime of the operations, I am getting inconsistent results. In other words, the sum of the runtimes of each operation is not close to the total runtime. I've constructed a simple example.
import time
import tqdm

all_times = {"Mult": 0, "Add": 0, "Full": 0}
s1 = time.process_time()
for i in tqdm.tqdm(range(int(1e4))):
    for j in range(i):
        s = time.process_time()
        z = i*i
        all_times["Mult"] += (time.process_time() - s)
        s = time.process_time()
        z = i+i
        all_times["Add"] += (time.process_time() - s)
all_times["Full"] += (time.process_time() - s1)
print(all_times)
This results in something like {'Mult': 17.977712000036018, 'Add': 17.74207199997629, 'Full': 69.089015}, and the wall-clock time I measured is also ~70 seconds.
The only things not being measured in "Mult" and "Add" are the timing operations themselves and the incrementing of i and j. If I remove the inner timing operations, the runtime of the process as a whole is not really affected. Surely, incrementing these iterators cannot be that expensive.
How do I measure these shorter operations more accurately?
Upvotes: 0
Views: 231
Reputation: 20798
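First, by adding timing code to your program, you change what you are observing. Each call to time.process_time() has its own cost, and for an operation as short as i*i that cost can be comparable to or larger than the operation being measured.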
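To make that concrete, here is a rough sketch (my own illustration, not code from the question) that estimates what one time.process_time() call costs next to one multiplication. The helper names timer_overhead and op_cost are made up for this example, and the numbers will differ from machine to machine.

```python
import time


def timer_overhead(n=200_000):
    """Estimate the average cost (seconds) of one time.process_time() call."""
    start = time.perf_counter()
    for _ in range(n):
        time.process_time()
    return (time.perf_counter() - start) / n


def op_cost(n=200_000):
    """Estimate the average cost (seconds) of one loop iteration doing i * i.

    The loop overhead is included, so this is an upper bound on the
    multiplication itself.
    """
    start = time.perf_counter()
    for i in range(n):
        z = i * i
    return (time.perf_counter() - start) / n


overhead = timer_overhead()
mult = op_cost()
print(f"one process_time() call: ~{overhead * 1e9:.0f} ns")
print(f"one i * i iteration:     ~{mult * 1e9:.0f} ns")
```

The question's inner loop makes four timer calls per iteration, and the time spent returning from one call and entering the next (plus the dictionary update) falls outside every measured interval. That is a plausible home for much of the time missing from "Mult" and "Add".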
You can use timeit (https://docs.python.org/3/library/timeit.html) to time a small piece of code.
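As a minimal sketch of how that could look for the two operations in the question: timeit runs each statement many times inside one timed loop, so the timer is read only at the start and end. The value 1234 and the repetition count are arbitrary choices for illustration.

```python
import timeit

# Run each statement many times so the timer is read only twice per measurement.
n = 200_000
mult_total = timeit.timeit("i * i", setup="i = 1234", number=n)
add_total = timeit.timeit("i + i", setup="i = 1234", number=n)

print(f"i * i: {mult_total / n * 1e9:.1f} ns per run")
print(f"i + i: {add_total / n * 1e9:.1f} ns per run")

# repeat() collects several samples; the minimum is usually the least noisy.
best_mult = min(timeit.repeat("i * i", setup="i = 1234", number=n, repeat=5))
print(f"best of 5 for i * i: {best_mult / n * 1e9:.1f} ns per run")
```

The per-run figures still include timeit's own loop overhead, so treat them as useful for comparing operations rather than as exact costs.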
Or, to see the execution time of each function, use cProfile:
python -m cProfile mycode.py
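cProfile reports time per function, so it cannot tell individual statements apart. One way to use it here, sketched under the assumption that you are willing to wrap each operation in a small function (mult, add and work are hypothetical names), is to profile from inside the script:

```python
import cProfile
import io
import pstats


def mult(i):
    return i * i


def add(i):
    return i + i


def work(n=300):
    # Same nested-loop shape as the question, with each operation in a function.
    for i in range(n):
        for _ in range(i):
            mult(i)
            add(i)


profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Collect the report in a string so it can be printed or saved.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Keep in mind that the function-call overhead and the profiler's own bookkeeping inflate the numbers for such tiny functions, so the report is better at showing relative hot spots than absolute costs.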
You can also save the profile to a file and then generate a flame graph from it (https://www.brendangregg.com/flamegraphs.html).
To do that, save the profile like this:
python -m cProfile -o profile.prof mycode.py
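Then install flameprof (https://github.com/baverman/flameprof):
pip install flameprof
and run it on the saved profile:
flameprof profile.prof > output.svg
(If you copied flameprof.py into your project instead of installing it, run python flameprof.py profile.prof > output.svg.)
Open the SVG in your browser to see the flame graph.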
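If you just want a quick text summary, a saved profile can also be read with the standard-library pstats module. The sketch below writes a small stand-in profile to a temporary file so that it runs on its own. With a real run you would pass your own profile.prof path to pstats.Stats instead.

```python
import cProfile
import os
import pstats
import tempfile

# Stand-in for `python -m cProfile -o profile.prof mycode.py`:
# profile a small statement and write the result to a temporary file.
prof_path = os.path.join(tempfile.mkdtemp(), "profile.prof")
cProfile.run("sorted(range(100_000), key=lambda x: -x)", prof_path)

# Load the saved profile and show the five entries with the most own time.
stats = pstats.Stats(prof_path)
stats.strip_dirs().sort_stats("tottime").print_stats(5)
```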
Upvotes: 1