user22490234

Reputation: 315

Why is calling float() on a number slower than adding 0.0 in Python?

What is the reason that casting an integer to a float is slower than adding 0.0 to that int in Python?

import timeit


def add_simple():
    for i in range(1000):
        a = 1 + 0.0


def cast_simple():
    for i in range(1000):
        a = float(1)


def add_total():
    total = 0
    for i in range(1000):
        total += 1 + 0.0


def cast_total():
    total = 0
    for i in range(1000):
        total += float(1)


print "Add simple timing: %s" % timeit.timeit(add_simple, number=1)
print "Cast simple timing: %s" % timeit.timeit(cast_simple, number=1)
print "Add total timing: %s" % timeit.timeit(add_total, number=1)
print "Cast total timing: %s" % timeit.timeit(cast_total, number=1)

The output of which is:

Add simple timing: 0.0001220703125
Cast simple timing: 0.000469923019409
Add total timing: 0.000164985656738
Cast total timing: 0.00040078163147

Upvotes: 19

Views: 2282

Answers (5)

chepner

Reputation: 531135

Simply speaking, you aren't casting anything. A type cast tells the compiler to treat the value in a variable as if it had a different type; the same underlying bits are used. Python's float(1), however, constructs a new object in memory distinct from the argument to float.

When you add 1 + 0.0, the interpreter first tries (1).__add__(0.0); int's __add__ returns NotImplemented for a float argument, so Python falls back to (0.0).__radd__(1), which the builtin float class knows how to handle. No additional objects (aside from the return value) need to be constructed.
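
A quick interactive check of that fallback (an illustrative session, not from the original answer):

>>> (1).__add__(0.0)
NotImplemented
>>> (0.0).__radd__(1)
1.0

Neither path allocates anything beyond the resulting float.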

Newer versions of Python have a peephole optimizer, so a constant expression like 1 + 0.0 is replaced at compile time with 1.0; no function needs to be executed at run time at all. To force the addition to happen at run time and observe the difference, replace 1 + 0.0 with x + 0.0, defining x = 1 prior to the loop. It will be slower than 1 + 0.0, but still faster than float(1).
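
For example, a minimal sketch of that experiment (the name add_variable is mine, not the question's):

import timeit

def add_variable():
    x = 1  # defined prior to the loop, so the optimizer can't fold x + 0.0
    for i in range(1000):
        a = x + 0.0  # the int/float addition now happens at run time

print("Add variable timing: %s" % timeit.timeit(add_variable, number=1))

Per the answer, this should time slower than add_simple but faster than cast_simple.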

Upvotes: 2

user2357112

Reputation: 280564

If you look at the bytecode for add_simple:

>>> dis.dis(add_simple)
  2           0 SETUP_LOOP              26 (to 29)
              3 LOAD_GLOBAL              0 (range)
              6 LOAD_CONST               1 (1000)
              9 CALL_FUNCTION            1
             12 GET_ITER
        >>   13 FOR_ITER                12 (to 28)
             16 STORE_FAST               0 (i)

  3          19 LOAD_CONST               4 (1.0)
             22 STORE_FAST               1 (a)
             25 JUMP_ABSOLUTE           13
        >>   28 POP_BLOCK
        >>   29 LOAD_CONST               0 (None)
             32 RETURN_VALUE

You'll see that 0.0 isn't actually anywhere in there. It just loads the constant 1.0 and stores it to a. Python computed the result at compile-time, so you're not actually timing the addition.

If you use a variable for the 1, so that Python's primitive peephole optimizer can't do the addition at compile time, adding 0.0 still has a lead:

>>> timeit.timeit('float(a)', 'a=1')
0.22538208961486816
>>> timeit.timeit('a+0.0', 'a=1')
0.13347005844116211

Calling float requires two dict lookups to figure out what float is: one in the module's global namespace, which misses, and one in the built-ins, which hits. It also carries the interpreter's function-call overhead, which is more expensive than a direct C-level call.

Adding 0.0 only requires indexing into the function's code object's co_consts to load the constant 0.0, and then calling the C-level nb_add functions of the int and float types to perform the addition. That is much less overhead overall.
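
One way to see the name-lookup cost in isolation is to pre-bind float in the timeit setup, so LOAD_GLOBAL succeeds on the first dict probe instead of falling through to the built-ins (this variation is a sketch of mine; numbers will vary by machine):

import timeit

# Two lookups per call: globals miss, then builtins hit.
print(timeit.timeit('float(a)', 'a=1'))

# One lookup per call: f lives in the timeit namespace's globals.
print(timeit.timeit('f(a)', 'a=1; f=float'))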

Upvotes: 11

Wayne Werner

Reputation: 51807

If you use the dis module, you can start to see why:

In [11]: dis.dis(add_simple)
  2           0 SETUP_LOOP              26 (to 29)
              3 LOAD_GLOBAL              0 (range)
              6 LOAD_CONST               1 (1000)
              9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             12 GET_ITER
        >>   13 FOR_ITER                12 (to 28)
             16 STORE_FAST               0 (i)

  3          19 LOAD_CONST               4 (1.0)
             22 STORE_FAST               1 (a)
             25 JUMP_ABSOLUTE           13
        >>   28 POP_BLOCK
        >>   29 LOAD_CONST               0 (None)
             32 RETURN_VALUE

In [12]: dis.dis(cast_simple)
  2           0 SETUP_LOOP              32 (to 35)
              3 LOAD_GLOBAL              0 (range)
              6 LOAD_CONST               1 (1000)
              9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             12 GET_ITER
        >>   13 FOR_ITER                18 (to 34)
             16 STORE_FAST               0 (i)

  3          19 LOAD_GLOBAL              1 (float)
             22 LOAD_CONST               2 (1)
             25 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             28 STORE_FAST               1 (a)
             31 JUMP_ABSOLUTE           13
        >>   34 POP_BLOCK
        >>   35 LOAD_CONST               0 (None)
             38 RETURN_VALUE

Note the extra LOAD_GLOBAL and CALL_FUNCTION in cast_simple.

Function calls in Python are (relatively) slow. So are . (attribute) lookups. Converting to float requires a function call, and that's why it's slower.
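
A rough way to see that call overhead on its own is to time a do-nothing call against the inline addition (a sketch of mine, not from the answer; absolute numbers depend on the machine):

import timeit

# Just the cost of entering and leaving a Python-level function:
print(timeit.timeit('f()', 'f = lambda: None'))

# The inline int + float addition, no call involved:
print(timeit.timeit('a + 0.0', 'a = 1'))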

Upvotes: 16

Aleksi Torhamo

Reputation: 6632

If you're using Python 3 or a recent version of Python 2 (2.5 or higher), the compiler does constant folding at bytecode generation time. This means that 1 + 0.0 is replaced with 1.0 before the code is ever executed.
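
You can verify the folding with the dis module (a minimal check; output details vary by version):

import dis

def f():
    return 1 + 0.0

dis.dis(f)  # the bytecode loads the constant 1.0 directly; no BINARY_ADD appears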

Upvotes: 7

nod

Reputation: 301

The addition can be done directly in C. The conversion causes a function to be called first, and only then does the C code kick in; there's overhead in that function call.

Upvotes: 3
