Raphael_LK

Reputation: 309

Tuple comprehension time profiling

Lately I've been doing some time profiling of my scripts. Wondering about tuple comprehension, I found several threads on SO pointing out two ways of doing it:
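tuple([i for i in xrange(1000000)])   # build a list first, then convert it to a tuple
tuple(i for i in xrange(1000000))     # feed a generator expression straight to tuple()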

I'm puzzled by the fact that cProfile and timeit tell me the first method is faster than the second one, while the command-line time and the kernprof line profiler say the opposite.

Here is what I get:

>>> import cProfile
>>> cProfile.run('tuple([i for i in xrange(1000000)])')
1000003 function calls in 0.139 seconds
>>> cProfile.run('tuple(i for i in xrange(1000000))')
1000003 function calls in 0.478 seconds 

>>> import timeit
>>> timeit.timeit('tuple([i for i in xrange(1000000)])')
0.08100390434265137
>>> timeit.timeit('tuple(i for i in xrange(1000000))')
0.08400511741638184

With test_tuple_list.py:

tuple([i for i in xrange(1000000)])

And test_tuple_generator.py:

tuple(i for i in xrange(1000000))

I get:

$time python test_tuple_list.py
real 0m0.398s
user 0m0.171s
sys 0m0.202s

$time python test_tuple_generator.py
real 0m0.333s
user 0m0.109s
sys 0m0.234s

With test_tuple_list_kernprof.py

@profile
def test():
    tuple([i for i in xrange(1000000)])
test()

And test_tuple_generator_kernprof.py:

@profile
def test():
    tuple(i for i in xrange(1000000))
test()

I get:

$kernprof.py -lv test_tuple_list_kernprof.py
Total time: 0.861045 s

$kernprof.py -lv test_tuple_generator_kernprof.py
Total time: 0.444025 s

I assume the differences I get between these profilers come from the way they profile, but how come they contradict one another?

Thank you

Upvotes: 3

Views: 129

Answers (1)

Martijn Pieters

Reputation: 1124718

Do not use a profiler to measure overall timing differences between two snippets of Python code. A profiler severely impacts code execution times across the interpreter: different code paths trigger the sys.settrace() trace function at different times, and the trace function itself can introduce subtle timing differences for different events, completely skewing the results and making your data useless for absolute timing comparisons.

When profiling, you are measuring how the profiler reacts to different code paths as much as you are measuring the code paths themselves. That's fine when you want to pinpoint where in complex code all your execution time goes, but it is terrible for comparing two different pieces of code purely on how fast they perform.
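As a rough illustration of that overhead (not the exact mechanism cProfile or kernprof use, just a plain sys.settrace() hook), you can time the same list comprehension with and without a do-nothing trace function installed:

import sys
import time

def trace(frame, event, arg):
    # do-nothing trace function; merely installing it forces the
    # interpreter to report call/line events, which costs time
    return trace

def build():
    return tuple([i for i in xrange(1000000)])

start = time.time()
build()
plain = time.time() - start

sys.settrace(trace)
start = time.time()
build()
traced = time.time() - start
sys.settrace(None)

print 'without trace: %.3fs, with trace: %.3fs' % (plain, traced)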

That leaves just your timeit results, which are too close to call. Both methods are about as fast as one another.
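If you want a slightly more robust comparison than a single timeit call, something along these lines (a minimal sketch; the repeat and number counts are just illustrative) gives more stable numbers by taking the best of several repeats:

import timeit

# best-of-several repeats reduces noise from other processes and the OS scheduler
list_based = min(timeit.repeat('tuple([i for i in xrange(1000000)])',
                               repeat=5, number=10))
gen_based = min(timeit.repeat('tuple(i for i in xrange(1000000))',
                              repeat=5, number=10))

print 'list comprehension:   %.4f s per 10 runs' % list_based
print 'generator expression: %.4f s per 10 runs' % gen_based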

Upvotes: 2
