Aguy
Aguy

Reputation: 8059

Why 'in' is faster than `__contains__`?

I thought that in is just a manifest of __contains__

In [1]: li = [1, 2, 3, 4, 5]

In [2]: 4 in li
Out[2]: True

In [3]: li.__contains__(4)
Out[3]: True

However in is 2x faster

In [4]: %timeit 4 in li
10000000 loops, best of 3: 103 ns per loop

In [5]: %timeit li.__contains__(4)
The slowest run took 10.06 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 213 ns per loop

Can you explain the differences between the two and why is in faster?

Upvotes: 1

Views: 982

Answers (3)

user2357112
user2357112

Reputation: 281046

In the standard implementation of Python, in doesn't quite use __contains__ directly. in actually uses the C-level sq_contains function pointer in the struct representing an object's type. For types with a __contains__ implemented in Python, sq_contains points to a function that looks up and invokes the __contains__ method, but for types with a __contains__ implemented in C, sq_contains points straight to the C implementation of the method.

Since list.__contains__ is implemented in C, 4 in li gets to invoke the C implementation directly, without passing through the overhead of a Python method lookup and a Python function call. li.__contains__(4) has to perform the Python method lookup and Python function call, so it's substantially slower.

If you want to see the code path involved for 4 in li, you can follow the call hierarchy down from COMPARE_OP in the bytecode evaluation loop. You'll see that it uses cmp_outcome:

TARGET(COMPARE_OP)
{
    w = POP();
    v = TOP();
    if (PyInt_CheckExact(w) && PyInt_CheckExact(v)) {
        ...
    }
    else {
      slow_compare:
        x = cmp_outcome(oparg, v, w);
    }

cmp_outcome uses PySequence_Contains:

static PyObject *
cmp_outcome(int op, register PyObject *v, register PyObject *w)
{
    int res = 0;
    switch (op) {
    ...
    case PyCmp_IN:
        res = PySequence_Contains(w, v);
        if (res < 0)
            return NULL;
        break;

PySequence_Contains looks up the sq_contains field of the tp_as_sequence field of the C struct representing the list type:

int
PySequence_Contains(PyObject *seq, PyObject *ob)
{
    Py_ssize_t result;
    if (PyType_HasFeature(seq->ob_type, Py_TPFLAGS_HAVE_SEQUENCE_IN)) {
        PySequenceMethods *sqm = seq->ob_type->tp_as_sequence;
        if (sqm != NULL && sqm->sq_contains != NULL)
            return (*sqm->sq_contains)(seq, ob);

This field stores a function pointer to list_contains, the C function implementing the list containment check. At no point does Python have to perform any dict lookups to find the method, or allocate a method object to represent the method, or build a tuple of arguments to pass to the method.

Upvotes: 4

MSeifert
MSeifert

Reputation: 152677

This answer somewhat builds upon the explanation of @MosesKoledoye but without any disassembling of source code.

There are several reasons why the .__contains__ is slower than just using in:

  • Each time you access a method/function/class/property by using . this introduces a lookup inside the class/instance dictionary. This introduces some overhead.

  • For custom classes a call to in is translated to .__contains__ (or .__iter__) but this isn't necessarily true for all python builtins. Python lists are implemented in C and so a something in list can be translated to a C-function without needing to call __contains__ or __iter__.

  • The in list is directly forwarded to C code (COMPARE_OP) which is much faster than accessing a python function and then calling the C code (CALL_FUNCTION).

In general: For python built-ins using literals like in may be faster but this is (in general) not true for custom classes. Most of the data model in python applies to custom classes. list.__contains__ is probably only implemented to fit the data model conventions and not because it is needed or faster.

Upvotes: 2

Moses Koledoye
Moses Koledoye

Reputation: 78554

Probably the same reason why {} is faster than dict(). The method call introduces an extra overhead:

>>> from dis import dis
>>> li = [1, 2, 3, 4, 5]
>>> c = lambda: 4 in li
>>> d = lambda: li.__contains__(4)
>>> dis(c)
  1           0 LOAD_CONST               1 (4)
              3 LOAD_GLOBAL              0 (li)
              6 COMPARE_OP               6 (in)
              9 RETURN_VALUE
>>> dis(d)
  1           0 LOAD_GLOBAL              0 (li)
              3 LOAD_ATTR                1 (__contains__)
              6 LOAD_CONST               1 (4)
              9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             12 RETURN_VALUE

See CALL_FUNCTION in the later case.

Upvotes: 5

Related Questions