user2236076

Reputation: 154

Python: Efficient way to call inbuilt function multiple times?

I have some code that looks something like this:

def somefunction(somelist):
    for item in somelist:
        if len(item) > 10:
            pass  # do something
        elif len(item) > 6:
            pass  # do something
        elif len(item) > 3:
            pass  # do something
        else:
            pass  # do something

Since I am calling len(item) multiple times, is it inefficient to do it this way? Would it be preferable to write the code as follows, or are they EXACTLY the same in performance?

def somefunction(somelist):
    for item in somelist:
        x = len(item)
        if x > 10:
            pass  # do something
        elif x > 6:
            pass  # do something
        elif x > 3:
            pass  # do something
        else:
            pass  # do something

Upvotes: 1

Views: 1407

Answers (5)

Ashwini Chaudhary

Reputation: 250931

The second approach is surely better, as the number of calls to len() is reduced:

In [16]: import dis

In [18]: lis=["a"*10000,"b"*10000,"c"*10000]*1000

In [19]: def first():
    for item in lis:
        if len(item)<100:
            pass
        elif 100<len(item)<200:
            pass
        elif 300<len(item)<400:
            pass

In [20]: def second():
    for item in lis:
        x=len(item)
        if x<100:
            pass
        elif 100<x<200:
            pass
        elif 300<x<400:
            pass

You can always time your code using the timeit module (here via IPython's %timeit magic):

In [21]: %timeit first()
100 loops, best of 3: 2.03 ms per loop

In [22]: %timeit second()
1000 loops, best of 3: 1.66 ms per loop
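
Outside IPython, roughly the same measurement can be made with the standard timeit module directly. This is only a sketch: best_time is a hypothetical helper name, the data mirrors lis from the session above, and the absolute numbers will depend on your machine and Python version.

```python
import timeit

# Sample data mirroring `lis` from the IPython session above.
lis = ["a" * 10000, "b" * 10000, "c" * 10000] * 1000


def first():
    # Calls len(item) again in every condition that is evaluated.
    for item in lis:
        if len(item) < 100:
            pass
        elif 100 < len(item) < 200:
            pass
        elif 300 < len(item) < 400:
            pass


def second():
    # Calls len(item) once per item and reuses the result.
    for item in lis:
        x = len(item)
        if x < 100:
            pass
        elif 100 < x < 200:
            pass
        elif 300 < x < 400:
            pass


def best_time(func, number=100, repeat=3):
    """Hypothetical helper: best per-call time in seconds over `repeat` runs."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


if __name__ == "__main__":
    for func in (first, second):
        print("%s: %.3f ms per call" % (func.__name__, best_time(func) * 1e3))
```

Taking the minimum over several repeats follows the same "best of 3" idea that %timeit reports above.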

Use dis.dis() to disassemble the Python bytecode into mnemonics. Note that first() contains three CALL_FUNCTION instructions for len, while second() contains only one:

In [24]: dis.dis(first)
  2           0 SETUP_LOOP             109 (to 112)
              3 LOAD_GLOBAL              0 (lis)
              6 GET_ITER            
        >>    7 FOR_ITER               101 (to 111)
             10 STORE_FAST               0 (item)

  3          13 LOAD_GLOBAL              1 (len)
             16 LOAD_FAST                0 (item)
             19 CALL_FUNCTION            1
             22 LOAD_CONST               1 (100)
             25 COMPARE_OP               0 (<)
             28 POP_JUMP_IF_FALSE       34

  4          31 JUMP_ABSOLUTE            7

  5     >>   34 LOAD_CONST               1 (100)
             37 LOAD_GLOBAL              1 (len)
             40 LOAD_FAST                0 (item)
             43 CALL_FUNCTION            1
             46 DUP_TOP             
             47 ROT_THREE           
             48 COMPARE_OP               0 (<)
             51 JUMP_IF_FALSE_OR_POP    63
             54 LOAD_CONST               2 (200)
             57 COMPARE_OP               0 (<)
             60 JUMP_FORWARD             2 (to 65)
        >>   63 ROT_TWO             
             64 POP_TOP             
        >>   65 POP_JUMP_IF_FALSE       71

  6          68 JUMP_ABSOLUTE            7

  7     >>   71 LOAD_CONST               3 (300)
             74 LOAD_GLOBAL              1 (len)
             77 LOAD_FAST                0 (item)
             80 CALL_FUNCTION            1
             83 DUP_TOP             
             84 ROT_THREE           
             85 COMPARE_OP               0 (<)
             88 JUMP_IF_FALSE_OR_POP   100
             91 LOAD_CONST               4 (400)
             94 COMPARE_OP               0 (<)
             97 JUMP_FORWARD             2 (to 102)
        >>  100 ROT_TWO             
            101 POP_TOP             
        >>  102 POP_JUMP_IF_FALSE        7

  8         105 JUMP_ABSOLUTE            7
            108 JUMP_ABSOLUTE            7
        >>  111 POP_BLOCK           
        >>  112 LOAD_CONST               0 (None)
            115 RETURN_VALUE        

In [25]: dis.dis(second)
  2           0 SETUP_LOOP             103 (to 106)
              3 LOAD_GLOBAL              0 (lis)
              6 GET_ITER            
        >>    7 FOR_ITER                95 (to 105)
             10 STORE_FAST               0 (item)

  3          13 LOAD_GLOBAL              1 (len)
             16 LOAD_FAST                0 (item)
             19 CALL_FUNCTION            1
             22 STORE_FAST               1 (x)

  4          25 LOAD_FAST                1 (x)
             28 LOAD_CONST               1 (100)
             31 COMPARE_OP               0 (<)
             34 POP_JUMP_IF_FALSE       40

  5          37 JUMP_ABSOLUTE            7

  6     >>   40 LOAD_CONST               1 (100)
             43 LOAD_FAST                1 (x)
             46 DUP_TOP             
             47 ROT_THREE           
             48 COMPARE_OP               0 (<)
             51 JUMP_IF_FALSE_OR_POP    63
             54 LOAD_CONST               2 (200)
             57 COMPARE_OP               0 (<)
             60 JUMP_FORWARD             2 (to 65)
        >>   63 ROT_TWO             
             64 POP_TOP             
        >>   65 POP_JUMP_IF_FALSE       71

  7          68 JUMP_ABSOLUTE            7

  8     >>   71 LOAD_CONST               3 (300)
             74 LOAD_FAST                1 (x)
             77 DUP_TOP             
             78 ROT_THREE           
             79 COMPARE_OP               0 (<)
             82 JUMP_IF_FALSE_OR_POP    94
             85 LOAD_CONST               4 (400)
             88 COMPARE_OP               0 (<)
             91 JUMP_FORWARD             2 (to 96)
        >>   94 ROT_TWO             
             95 POP_TOP             
        >>   96 POP_JUMP_IF_FALSE        7

  9          99 JUMP_ABSOLUTE            7
            102 JUMP_ABSOLUTE            7
        >>  105 POP_BLOCK           
        >>  106 LOAD_CONST               0 (None)
            109 RETURN_VALUE   
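
Rather than reading the listings by eye, the call instructions can also be counted programmatically. The sketch below assumes Python 3.4+ (for dis.get_instructions); count_calls is a hypothetical helper, and matching on the "CALL" prefix is an assumption, since opcode names differ between CPython versions (CALL_FUNCTION in older releases, CALL in newer ones).

```python
import dis

lis = ["a" * 10, "b" * 150, "c" * 350]


def first():
    for item in lis:
        if len(item) < 100:
            pass
        elif 100 < len(item) < 200:
            pass
        elif 300 < len(item) < 400:
            pass


def second():
    for item in lis:
        x = len(item)
        if x < 100:
            pass
        elif 100 < x < 200:
            pass
        elif 300 < x < 400:
            pass


def count_calls(func):
    """Hypothetical helper: count call instructions in func's bytecode.

    Assumption: every call opcode name starts with "CALL"
    (CALL_FUNCTION, CALL, ...), which varies by CPython version.
    """
    return sum(
        1 for ins in dis.get_instructions(func) if ins.opname.startswith("CALL")
    )


print("first:", count_calls(first))
print("second:", count_calls(second))
```

On CPython this should report more call instructions for first() than for second(), matching the three len calls versus one visible in the disassembly above.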

Upvotes: 2

Gareth Latty

Reputation: 88987

Python does not treat the two as equivalent, because they are not equivalent for an arbitrary function. Consider this function, x():

y = 1

def x():
    return 1

And these two tests:

>>> print(x() + y)
2
>>> print(x() + y)
2

And:

>>> hw = x()
>>> print(hw + y)
2
>>> print(hw + y)
2

These produce exactly the same results. But what if our function has side effects?

y = 1

def x():
    global y
    y += 1
    return 1

The first case:

>>> print(x() + y)
3
>>> print(x() + y)
4

The second case:

>>> hw = x()
>>> print(hw + y)
3
>>> print(hw + y)
3 

You can see that this optimization only works if the function has no side-effects, otherwise it can alter the program. As Python can't tell if a function has side-effects, it can't do this optimization.

As such, it makes sense to store the value in a local variable and reuse it rather than calling the function again and again, although in practice the difference is tiny and unlikely to matter. It is also more readable and avoids repetition, so it's generally a good idea to code that way.

Upvotes: 1

rainer

Reputation: 7099

You can check such things with dis.dis:

import dis

def somefunction1(item):
    if len(item) > 10:
        print(1)
    elif len(item) > 10:
        print(2)

def somefunction2(item):
    x = len(item)
    if x > 10:
        print(1)
    elif x > 10:
        print(2)

print("#1")
dis.dis(somefunction1)

print("#2")
dis.dis(somefunction2)

Interpreting the output:

#1
  4           0 LOAD_GLOBAL              0 (len)
              3 LOAD_FAST                0 (item)
              6 CALL_FUNCTION            1
              9 LOAD_CONST               1 (10)
             12 COMPARE_OP               4 (>)
             15 POP_JUMP_IF_FALSE       26
[...]
  6     >>   26 LOAD_GLOBAL              0 (len)
             29 LOAD_FAST                0 (item)
             32 CALL_FUNCTION            1
             35 LOAD_CONST               1 (10)
             38 COMPARE_OP               4 (>)
             41 POP_JUMP_IF_FALSE       52
[...]
#2
 10           0 LOAD_GLOBAL              0 (len)
              3 LOAD_FAST                0 (item)
              6 CALL_FUNCTION            1
              9 STORE_FAST               1 (x)

 11          12 LOAD_FAST                1 (x)
             15 LOAD_CONST               1 (10)
             18 COMPARE_OP               4 (>)
             21 POP_JUMP_IF_FALSE       32
[...]
 13     >>   32 LOAD_FAST                1 (x)
             35 LOAD_CONST               1 (10)
             38 COMPARE_OP               4 (>)
             41 POP_JUMP_IF_FALSE       52

You can see that in the first example, len(item) is called twice (see the two CALL_FUNCTION instructions?), whereas it is only called once in the second implementation.

That leaves the question of how len() is implemented for your objects: it is O(1) (i.e. cheap) for built-in types such as lists, but for custom classes that define their own __len__, it need not be.
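
To make that concrete, here is a sketch of a container whose length is not O(1). LinkedList, classify_repeated and classify_cached are hypothetical names invented for this illustration; the len_calls counter exists only to show how often __len__ runs.

```python
class LinkedList:
    """Hypothetical singly linked list whose __len__ walks every node (O(n))."""

    def __init__(self, values):
        self.head = None
        # Build the chain back to front as nested (value, next) tuples.
        for value in reversed(list(values)):
            self.head = (value, self.head)
        self.len_calls = 0  # bookkeeping for the demonstration only

    def __len__(self):
        self.len_calls += 1
        count = 0
        node = self.head
        while node is not None:
            count += 1
            node = node[1]
        return count


def classify_repeated(item):
    # Re-evaluates len(item) in every condition that is reached.
    if len(item) > 10:
        return "long"
    elif len(item) > 6:
        return "medium"
    elif len(item) > 3:
        return "short"
    else:
        return "tiny"


def classify_cached(item):
    # Evaluates len(item) once and reuses the result.
    length = len(item)
    if length > 10:
        return "long"
    elif length > 6:
        return "medium"
    elif length > 3:
        return "short"
    else:
        return "tiny"


a = LinkedList(range(5))
b = LinkedList(range(5))
print(classify_repeated(a), a.len_calls)
print(classify_cached(b), b.len_calls)
```

For a five-element list the repeated version walks the whole chain three times, while the cached version walks it once; with an expensive __len__ that difference grows with the size of the container.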

Upvotes: 1

Thanakron Tandavas

Reputation: 5683

len() is an O(1) operation, which means calling it is very cheap. So stop worrying about it and focus on improving other parts of your code.

However, personally, I think the second way is better: if you rename the variable from x to length, it improves your code's readability.

def somefunction(somelist):
    for item in somelist:
        length = len(item)
        if length > 10:
            pass  # do something
        elif length > 6:
            pass  # do something
        elif length > 3:
            pass  # do something
        else:
            pass  # do something

NOTE: len() is O(1) for built-in types such as strings, lists, sets, and dictionaries.

Upvotes: 2

Antimony

Reputation: 39451

Unlike compilers for most other languages, Python doesn't perform this kind of optimization automatically (unless you're using PyPy), so the second version is probably faster. But unless item has a custom __len__ implementation that takes a while, it probably won't speed things up much either. This is the sort of micro-optimization that should be reserved for tight inner loops after profiling has indicated a problem.
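
As a rough sketch of what "profiling first" could look like with the standard cProfile and pstats modules: profile_report is a hypothetical helper, the sample data is made up, and the bucket counting merely stands in for the question's "do something".

```python
import cProfile
import io
import pstats


def somefunction(somelist):
    # Same shape as the question's first version: len() is called per branch.
    buckets = {"long": 0, "medium": 0, "short": 0, "tiny": 0}
    for item in somelist:
        if len(item) > 10:
            buckets["long"] += 1
        elif len(item) > 6:
            buckets["medium"] += 1
        elif len(item) > 3:
            buckets["short"] += 1
        else:
            buckets["tiny"] += 1
    return buckets


def profile_report(func, *args):
    """Hypothetical helper: run func under cProfile and return the text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return stream.getvalue()


# Made-up sample data covering all four branches.
sample = ["a" * 12, "b" * 8, "c" * 4, "d"] * 1000
print(profile_report(somefunction, sample))
```

If the built-in len entry accounts for only a small share of the total time in the report, caching its result is unlikely to be worth doing for speed alone.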

Upvotes: 1
