What version of this Python code is faster?

Question

Just an academic question. I'm curious which version of this code is better (faster) in Python.

import random

var = random.randint(1, 10)
# First version
if var in [2, 5, 9]:
  print("First: 2, 5 or 9")

# Second version
if var == 2 or var == 5 or var == 9:
  print("Second: 2, 5 or 9")

This is a very simple example, but what if var is a string rather than a number?

var = 'aaa'
# First version
if var in ['aaa', 'zzz', 'eee']:
  print("String")

What about more complicated objects (not just numbers or strings, but instances of a class whose comparison is very time-consuming)?

And what happens inside the Python compiler? I suppose "if var in list" is executed like this:

for l in list:
  if l == var:
    print("String")

So in my opinion both versions in the first example run at the same speed. Am I right?

Raymond Hettinger · Accepted Answer

Timing the code will show which is fastest. Use the timeit module for that:

~ $ python -m timeit --setup 'var=2' 'var in [2, 5, 9]'
10000000 loops, best of 3: 0.0629 usec per loop
~ $ python -m timeit --setup 'var=5' 'var in [2, 5, 9]'
10000000 loops, best of 3: 0.0946 usec per loop
~ $ python -m timeit --setup 'var=9' 'var in [2, 5, 9]'
10000000 loops, best of 3: 0.117 usec per loop

~ $ python -m timeit --setup 'var=2' 'var == 2 or var==5 or var == 9'
10000000 loops, best of 3: 0.0583 usec per loop
~ $ python -m timeit --setup 'var=5' 'var == 2 or var==5 or var == 9'
10000000 loops, best of 3: 0.104 usec per loop
~ $ python -m timeit --setup 'var=9' 'var == 2 or var==5 or var == 9'
10000000 loops, best of 3: 0.127 usec per loop
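
In both versions the timings grow with the position of the match, which is consistent with both forms testing candidates left to right and stopping at the first hit. A minimal sketch that makes this visible by counting equality calls instead of measuring time. It assumes CPython; CountingInt, count_in and count_or are hypothetical names introduced only for illustration:

```python
class CountingInt(object):
    """Hypothetical wrapper that counts how often it is compared for equality."""

    def __init__(self, value):
        self.value = value
        self.eq_calls = 0

    def __eq__(self, other):
        # Every equality test involving this object passes through here
        self.eq_calls += 1
        return self.value == other


def count_in(value):
    """Return (found, number of __eq__ calls) for the membership test."""
    var = CountingInt(value)
    found = var in [2, 5, 9]
    return found, var.eq_calls


def count_or(value):
    """Return (found, number of __eq__ calls) for the chained comparison."""
    var = CountingInt(value)
    found = var == 2 or var == 5 or var == 9
    return found, var.eq_calls


for value in (2, 5, 9, 7):
    print(value, count_in(value), count_or(value))
```

For objects with an expensive comparison, the number of comparisons is likely to dominate the cost, and in this sketch both forms perform the same number of them.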

If you want to improve your instincts about what Python is doing under the hood and learn which code is fastest, start by disassembling the code:

import dis
import random

def f():
    var = random.randint(1, 10)
    # First version
    if var in [2, 5, 9]:
        print("First: 2, 5 or 9")

    # Second version
    if var == 2 or var == 5 or var == 9:
        print("Second: 2, 5 or 9")

# Print the bytecode the compiler generated for f
dis.dis(f)

This will show that the this-or-this-or-that code does more steps than the list.__contains__ version.
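
To compare the two forms side by side, each can be placed in its own function and the bytecode instructions counted. This is a rough sketch, assuming Python 3.4 or later (for dis.get_instructions); in_version, or_version and instruction_count are hypothetical names, and the exact counts vary between interpreter versions:

```python
import dis


def in_version(var):
    return var in [2, 5, 9]


def or_version(var):
    return var == 2 or var == 5 or var == 9


def instruction_count(func):
    # dis.get_instructions yields one Instruction per bytecode operation
    return sum(1 for _ in dis.get_instructions(func))


print("in:", instruction_count(in_version))
print("or:", instruction_count(or_version))
```

Instruction count is only a proxy for speed; timing the code remains the real measure.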

FWIW, you may want to consider using a set instead of a list. Whether its O(1) lookup beats the list's O(n) search depends on key frequency (is the first element of the list the most likely match?), on how expensive the hash function is, and on the number of elements (sets scale up better than lists):

if var in {2, 5, 9}:
    ...
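
The scaling difference can be illustrated without a stopwatch by counting how often a key is hashed and compared during a lookup. This is a minimal sketch, assuming CPython; CountingKey and lookup_cost are hypothetical names introduced for illustration:

```python
class CountingKey(object):
    """Hypothetical key that records how often it is hashed and compared."""

    def __init__(self, value):
        self.value = value
        self.eq_calls = 0
        self.hash_calls = 0

    def __eq__(self, other):
        self.eq_calls += 1
        return self.value == other

    def __hash__(self):
        # Hash like the wrapped value so the key lands in the same set slot
        self.hash_calls += 1
        return hash(self.value)


def lookup_cost(container, value):
    """Return (found, __eq__ calls, __hash__ calls) for one membership test."""
    key = CountingKey(value)
    found = key in container
    return found, key.eq_calls, key.hash_calls


numbers = list(range(1000))
as_list = numbers
as_set = set(numbers)

# Worst case for the list: the match is the last element
print("list:", lookup_cost(as_list, 999))
print("set: ", lookup_cost(as_set, 999))
```

The list search compares against every element up to the match and never hashes. The set lookup pays for one hash and then typically compares against very few entries, which is why the cost of the hash function matters for small collections.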

On my machine, sets did not help for searching a small number of integer elements:

~ $ python -m timeit --setup 'var=2' 'var in {2, 5, 9}'
1000000 loops, best of 3: 0.276 usec per loop
~ $ python -m timeit --setup 'var=5' 'var in {2, 5, 9}'
1000000 loops, best of 3: 0.281 usec per loop
~ $ python -m timeit --setup 'var=9' 'var in {2, 5, 9}'
1000000 loops, best of 3: 0.304 usec per loop
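
One possible explanation, offered as an assumption rather than something measured in this answer: depending on the interpreter version, the set literal may be rebuilt every time the expression runs, so the timing includes construction as well as lookup. (CPython 3.2 and later fold a set literal of constants in a membership test into a prebuilt frozenset.) A sketch that separates the two costs with the timeit module; allowed is a hypothetical name, and no particular result is claimed:

```python
import timeit

# Set literal evaluated inside the timed statement
inline = timeit.timeit('var in {2, 5, 9}', setup='var = 9', number=100000)

# Set built once in setup, so only the lookup itself is timed
hoisted = timeit.timeit(
    'var in allowed',
    setup='var = 9; allowed = frozenset([2, 5, 9])',
    number=100000,
)

print("inline set literal: %.4f s" % inline)
print("prebuilt frozenset: %.4f s" % hoisted)
```

If the two numbers differ noticeably on your interpreter, building the set once outside the hot path is the fairer comparison against the list version.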
