Reputation: 11
I am testing several different algorithms (for solving 16x16 sudokus) against each other, measuring their performance with the timeit module. However, it appears that only the first of the timeit.repeat() iterations actually computes anything, because the other iterations complete much faster. Testing a single algorithm with t.repeat(repeat=10, number=1) gives the following results:
[+] Results for......: solve1 (function 1/1)
[+] Fastest..........: 0.00003099
[+] Slowest..........: 32.38717794
[+] Average*.........: 0.00003335 (avg. calculated w/o min/max values)
The first of the 10 results always takes much longer to complete, which seems explainable only if iterations 2 to 10 of the timeit.repeat() loop somehow reuse cached results from the loop's previous iterations. When I then use timeit.repeat() in a for loop to compare several algorithms against each other, it again appears that the solution to the puzzle is calculated only once:
[+] Results for......: solve1 (function 1/3)
[+] Fastest..........: 0.00003099
[+] Slowest..........: 16.33443809
[+] Average*.........: 0.00003263 (avg. calculated w/o min/max values)
[+] Results for......: solve2 (function 2/3)
[+] Fastest..........: 0.00365305
[+] Slowest..........: 0.02915907
[+] Average*.........: 0.00647599 (avg. calculated w/o min/max values)
[+] Results for......: solve3 (function 3/3)
[+] Fastest..........: 0.00659299
[+] Slowest..........: 0.02440906
[+] Average*.........: 0.00717765 (avg. calculated w/o min/max values)
The really weird thing is that the relative speed of the algorithms (in relation to each other) is consistent across measurements, which would indicate that each algorithm is computing its own results. Is this extreme increase in performance due to a large part of the intermediate results (obtained while computing the first solution) still being held in some sort of cache reserved by the Python process?
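One way to rule out state carried over between repetitions is to rebuild the unsolved grid inside the timed statement, so every repetition starts from scratch. A minimal sketch of that setup, where `puzzle` and `solve1` are hypothetical stand-ins for the question's actual code, and the min/max-trimmed average matches the statistics reported above:

```python
import timeit

# Hypothetical 16x16 grid; 0 marks an empty cell.
puzzle = [[0] * 16 for _ in range(16)]

def solve1(grid):
    # Stand-in solver that mutates the grid in place.
    for row in grid:
        for i in range(len(row)):
            if row[i] == 0:
                row[i] = 1
    return grid

# A fresh copy of the grid is built inside the timed statement, so no
# solved state from a previous repetition can carry over.
times = timeit.repeat(
    stmt="solve1([row[:] for row in puzzle])",
    globals=globals(),
    repeat=10,
    number=1,
)

fastest, slowest = min(times), max(times)
trimmed = sorted(times)[1:-1]          # drop the min and max values
average = sum(trimmed) / len(trimmed)  # avg. calculated w/o min/max
```

If the first repetition is still far slower with this setup, the effect is not cached puzzle state but something at the interpreter or OS level.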
Any help/insights would be greatly appreciated.
Upvotes: 1
Views: 2070
Reputation: 6849
I think memory allocation is the problem.
The Python interpreter maintains its own memory pool, which starts with no (or little?) pooled memory. During the first run of your program, a lot of memory is allocated from the system and then freed back to the pool, so the following runs obtain memory from the pool, which is much faster than requesting it from the system.
But this only makes sense if your algorithm consumes a lot of memory.
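A rough way to observe this is to time a function whose only work is allocating and discarding many small objects; the absolute numbers vary by machine, so this is an illustrative sketch rather than a reliable benchmark:

```python
import timeit

def churn():
    # Allocate many small objects; when the list is discarded, the freed
    # blocks go back to CPython's small-object allocator (pymalloc) and
    # are not necessarily returned to the operating system.
    data = [[0] * 64 for _ in range(100_000)]
    return len(data)

times = timeit.repeat(churn, repeat=5, number=1)
# The first repetition may pay for growing the pool with fresh memory
# from the system; later repetitions can reuse the pooled blocks.
```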
Upvotes: 2