Omid
Omid

Reputation: 2667

Why evaluating boolean objects takes time in python

I compared these two code snippets using the timeit module and realized the second one is slightly faster:

~$ python -m timeit —setup "l=[1, 2];k=1" "l[k==1]"
10000000 loops, best of 3: 0.0414 usec per loop
~$ python -m timeit —setup "l=[1, 2];k=1" "l[0 if k==1 else 1]"
10000000 loops, best of 3: 0.0372 usec per loop

Since the logic is the same I thought evaluating boolean objects takes more time than integer equivalence (True == 1 and False == 0), therefore I came up with the following benchmark and it turns out that I was correct:

~$ python -m timeit —setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.0411 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.0394 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.0416 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[True]"
10000000 loops, best of 3: 0.0428 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[True]"
10000000 loops, best of 3: 0.0394 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[True]"
10000000 loops, best of 3: 0.0393 usec per loop
~$ 
~$
~$ python -m timeit —setup "l=range(1000)" "l[0]"
10000000 loops, best of 3: 0.0232 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[0]"
10000000 loops, best of 3: 0.0232 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[0]"
10000000 loops, best of 3: 0.0232 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[1]"
10000000 loops, best of 3: 0.0232 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[1]"
10000000 loops, best of 3: 0.0232 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[1]"
10000000 loops, best of 3: 0.0232 usec per loop

But I don't know what the underlying reason is. I mean why evaluating True and False takes more time? Also I noticed another mysterious thing during the benchmark. In the first part of the benchmark there is variation in the results while the numbers for the second one is stable.

Upvotes: 4

Views: 1337

Answers (2)

user2357112
user2357112

Reputation: 280554

For l[k==1] and l[0 if k==1 else 1], you didn't time it long enough. The difference you saw is within what you'd get from random variation. I'm not sure which form is ultimately faster, but a longer trial showed the opposite effect:

>>> timeit.timeit('l[k==1]', 'l=[1,2];k=1', number=100000000)
10.782931089401245
>>> timeit.timeit('l[0 if k==1 else 1]', 'l=[1,2];k=1', number=100000000)
11.140317916870117

l[0 if k==1 else 1] is unexpectedly competitive most likely because l[k==1] doesn't hit the fast path for the BINARY_SUBSCR opcode:

TARGET_NOARG(BINARY_SUBSCR)
{
    w = POP();
    v = TOP();
    if (PyList_CheckExact(v) && PyInt_CheckExact(w)) {
        /* INLINE: list[int] */
        Py_ssize_t i = PyInt_AsSsize_t(w);
        if (i < 0)
            i += PyList_GET_SIZE(v);
        if (i >= 0 && i < PyList_GET_SIZE(v)) {
            x = PyList_GET_ITEM(v, i);
            Py_INCREF(x);
        }
        else
            goto slow_get;
    }
    else
      slow_get:
        x = PyObject_GetItem(v, w);

In your second test, there's the additional factor that in Python 2, True is a built-in variable lookup, while 1 is a much faster LOAD_CONST. LOAD_CONST only indexes into the code object's co_consts tuple, while a built-in lookup takes two dict lookups.

Upvotes: 4

Nander Speerstra
Nander Speerstra

Reputation: 1526

The difference between boolean and integer division have been asked earlier. However, the (in)stability of it isn't discussed. Below, my scores:

Python2

~$ python2 -m timeit --setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.0366 usec per loop
~$ python2 -m timeit --setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.0332 usec per loop
~$ python2 -m timeit --setup "l=range(1000)" "l[1]"
10000000 loops, best of 3: 0.0193 usec per loop
~$ python2 -m timeit --setup "l=range(1000)" "l[1]"
100000000 loops, best of 3: 0.0194 usec per loop
~$ python2 -m timeit --setup "l=range(1000)" "l[1]"
100000000 loops, best of 3: 0.0195 usec per loop
~$ python2 -m timeit --setup "l=range(1000)" "l[0]"
100000000 loops, best of 3: 0.0196 usec per loop

Python 3

~$ python -m timeit --setup "l=range(1000)" "l[0]"
10000000 loops, best of 3: 0.0712 usec per loop
~$ python -m timeit --setup "l=range(1000)" "l[0]"
10000000 loops, best of 3: 0.072 usec per loop
~$ python -m timeit --setup "l=range(1000)" "l[0]"
10000000 loops, best of 3: 0.0719 usec per loop
~$ python -m timeit --setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.082 usec per loop
~$ python -m timeit --setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.0821 usec per loop

Funny thing is: my score alter not only between Python versions, but also within Python versions. Due to cache misses, differences are logical. Funny thing at your scores is that the differences at the 0 and 1 are so small, you cannot see it with 4 decimals... (I'm using a virtual machine, so this may slow my system enough to make it easy to see the difference)

Upvotes: 0

Related Questions