Why int(len(str(k))) is faster than a loop (while)

Question

I am wondering why this function:

def digit(k):
    return len(str(k))

is faster than this one ? :

def digit(k):
    i = 0
    while k != 0:
        k = k // 10
        i += 1
    return i

And why it's the opposite for example in C?

zwol · Accepted Answer

Let's look at what happens if we take your Python code and translate it as literally as possible to C. We can do that very easily with Cython:

# save this in a file named "testmod.pyx" and compile it with Cython and a
# C compiler - details vary depending on OS and Python installation

from libc.stdio cimport snprintf
from libc.string cimport strlen

def c_digit_loop(k_):
    cdef unsigned int k = k_
    cdef int i = 0
    while k != 0:
        k = k // 10
        i += 1
    return i

def c_digit_str(k_):
    cdef unsigned int k = k_
    cdef char strbuf[32] # more than enough for any 'unsigned int'
    snprintf(strbuf, sizeof(strbuf), "%u", k);
    return strlen(strbuf);

The machine code you get from this is not as optimal as it could be, but it's close enough for a quick test. This allows us to compare performance directly using timeit, like this:

# save this in a file named 'test.py' and run it using the
# same CPython you compiled testmod.pyx against

import timeit
from testmod import c_digit_loop, c_digit_str

def py_digit_loop(k):
    i = 0
    while k != 0:
        k = k // 10
        i += 1
    return i

def py_digit_str(k):
    return len(str(k))

def test1(name):
    print(name, timeit.timeit(name+"(1234567)", "from __main__ import "+name,
                              number=10000))

test1("py_digit_loop")
test1("py_digit_str")
test1("c_digit_str")
test1("c_digit_loop")

When I run this program, this is the output I get on the computer where I'm typing this. I've manually lined up the numbers to make them easier to compare by eye.

py_digit_loop 0.004024484000183293
py_digit_str  0.0020454510013223626
c_digit_str   0.0009924650003085844
c_digit_loop  0.00025072999960684683

So that confirms your original assertion: the loop is slower than converting to a string in Python, but in C it's the other way around. But notice that converting to a string in C is still faster than converting to a string in Python.

To know precisely why this is happening we would need to dig deeper into the guts of the Python interpreter than I feel like doing this morning, but I know enough about its guts already to tell you in outline. The CPython interpreter is not very efficient. Even operations on small integers involve reference counting and construction of scratch objects on the heap. Your loop that does basic arithmetic in Python requires one or two scratch objects per iteration (depending on whether 0, 1, 2, ... are "interned"). Doing the calculation by converting to a string and taking its length involves creating only one temporary object, the string, for the whole calculation. The bookkeeping involved with these scratch objects dwarfs the cost of the actual calculation, for both of the Python implementations.

The C string-based implementation performs almost exactly the same steps that the Python string-based implementation performs, but its scratch object is a char array on the stack, not a full-fledged Python string object, and that all by itself is apparently good for a 40-50% speedup.

The C loop-based implementation compiles down to eight machine instructions for the actual loop. No memory accesses. Not even a hardware divide instruction (that's the magic of strength reduction). And then hundreds more instructions dealing with the Python object model. Most of those 0.00025 seconds are still overhead.

Why int(len(str(k))) is faster than a loop (while)

Answers (1)

Related Questions