Reputation: 13279
Given the following three functions:

def v1(a):
    c = 0
    for a_ in a:
        if a_ is not None:
            c += 1
    return c

def v2(a):
    c = 0
    for a_ in a:
        if a_:
            c += 1
    return c

def v3(a):
    c = 0
    for a_ in a:
        if bool(a_):
            c += 1
    return c
I get the following performance (I'm using Python 3.6 on Ubuntu 18.04):
values = [random.choice([1, None]) for _ in range(100000)]
%timeit v1(values)
3.35 ms ± 28 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit v2(values)
2.83 ms ± 36.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit v3(values)
12.3 ms ± 59.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
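For anyone who wants to reproduce this outside IPython, here is a rough sketch of the same measurement using the standard `timeit` module. The seed, the repeat counts, and the `timings` dict are my own choices for illustration, and absolute numbers will differ by machine and Python version.

```python
import random
import timeit

def v1(a):
    c = 0
    for a_ in a:
        if a_ is not None:
            c += 1
    return c

def v2(a):
    c = 0
    for a_ in a:
        if a_:
            c += 1
    return c

def v3(a):
    c = 0
    for a_ in a:
        if bool(a_):
            c += 1
    return c

random.seed(0)  # arbitrary seed, only so repeated runs use the same data
values = [random.choice([1, None]) for _ in range(100000)]

timings = {}
for f in (v1, v2, v3):
    # Best of 3 repeats, 10 calls per repeat, reported as seconds per call.
    best = min(timeit.repeat("f(values)", globals={"f": f, "values": values},
                             number=10, repeat=3))
    timings[f.__name__] = best / 10

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.2f} ms per call")
```

All three functions return the same count for this input, so only the cost of the truth test differs.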
The similar performance between v1 and v2 makes sense, but why is v3 so much slower, given that v2 is presumably implicitly calling bool(a_) too?
Is it simply calling bool() from Python rather than from C (as I assume if does) that's causing the difference in performance?
Upvotes: 1
Views: 364
Reputation: 160407
This is mainly due to Python's dynamism and the fact that you have a Python-level function call.
With bool(a_), Python can't go straight to producing a bool object. It has to look up what is currently bound to the name bool, check that it is something that can be called, parse its arguments, and then call it.
A construct such as if a_ has a fixed meaning. It goes through a specific opcode (POP_JUMP_IF_FALSE here) that checks whether the loaded value is truthy. Far fewer hoops to jump through.
bool ends up calling the same function to check whether the supplied value is True or False; it just has a longer trip until it gets there.
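To illustrate the lookup point, here is a small sketch showing that the name `bool` is resolved at run time every time the function executes. The function `count_truthy` and the shadowing lambda are made up for the demonstration; rebinding `bool` like this is not something you would do in real code.

```python
def count_truthy(items):
    """Count truthy items by calling bool() explicitly, like v3."""
    c = 0
    for item in items:
        if bool(item):  # the name `bool` is looked up when this line runs
            c += 1
    return c

values = [1, None, 0, "x"]
count_builtin = count_truthy(values)    # resolves to the built-in bool

# Shadow the name in the function's global namespace (illustration only).
count_truthy.__globals__["bool"] = lambda obj: True
count_shadowed = count_truthy(values)   # now calls the lambda instead

# Remove the shadow so the built-in is found again.
del count_truthy.__globals__["bool"]
count_restored = count_truthy(values)

print(count_builtin, count_shadowed, count_restored)  # 2 4 2
```

Because the name can be rebound at any time, the interpreter cannot assume that `bool(item)` means "test truthiness" and has to go through the general call machinery, whereas `if item:` cannot be redefined this way.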
Upvotes: 3
Reputation: 531125
v2 is able to evaluate the "truthiness" of a_ directly in the interpreter:
>>> dis.dis(v2)
...
 11          14 LOAD_FAST                2 (a_)
             16 POP_JUMP_IF_FALSE       10
...
whereas v3 is required to actually call bool at the Python level:
>>> dis.dis(v3)
...
 18          14 LOAD_GLOBAL              0 (bool)
             16 LOAD_FAST                2 (a_)
             18 CALL_FUNCTION            1
             20 POP_JUMP_IF_FALSE       10
...
The function call is what slows v3 down.
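If you want to check this on your own interpreter without reading the full listing, a sketch along these lines works. The helper `global_loads` is a hypothetical name of mine. Exact opcode names for the call and the jump vary between CPython versions (for example, newer releases use `CALL` rather than `CALL_FUNCTION`), so this only looks at `LOAD_GLOBAL`, which I believe is stable across the versions I know of.

```python
import dis

def v2(a):
    c = 0
    for a_ in a:
        if a_:
            c += 1
    return c

def v3(a):
    c = 0
    for a_ in a:
        if bool(a_):
            c += 1
    return c

def global_loads(func):
    """Return the names that func loads from the global/builtin namespace."""
    return {ins.argval
            for ins in dis.get_instructions(func)
            if ins.opname == "LOAD_GLOBAL"}

print(global_loads(v2))  # set()
print(global_loads(v3))  # {'bool'}
```

v2 never touches a global name, while v3 has to fetch `bool` and then call it on every iteration.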
Upvotes: 2