Dan
Dan

Reputation: 484

How does Python check if an element exists in a list?

Using the standard syntax that Python supplies to check if an element is in a list:

if someElement in someList:

What is actually being executed here? Is Python looping through every index and checking for equality, or is something more sophisticated being implemented?

The program I am writing is running extremely slowly. No math is being performed, but it relies heavily on checking to see if items exist in long lists. Is there a more rapid solution?

SOLVED: Checking if an element is in a list is the same as looping through every item and checking for equality. However, checking for an item in a set is significantly faster since the items are hashed.

Even if the items in your list are unhashable (in my case, other lists), it is still worth it to convert to a string, store in a set, and convert back when needed. At first, I thought this was bulky and would decrease performance. However, it literally allowed my program to complete in a matter of minutes when it would have taken days previously.

Don't underestimate the speed of checking items in a set.

Upvotes: 2

Views: 4880

Answers (3)

TessellatingHeckler
TessellatingHeckler

Reputation: 29023

Yes, Python is looping through every index. The in operator calls the __contains__() special method ( source ).

I think for a list - assuming CPython 2 - it ends up at this list_contains code in listobject.c, a straightforward for loop over the list items:

list_contains(PyListObject *a, PyObject *el)
{
    Py_ssize_t i;
    int cmp;

    for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i)
        cmp = PyObject_RichCompareBool(el, PyList_GET_ITEM(a, i),
                                           Py_EQ);
    return cmp;
}

Is there a more rapid solution?

Use a container with faster lookups - @vks suggests a set, dictionaries are also common. Both depend on your data being hashable with hash(item) or they can't work.

But the study of datastructures and their different performance characteristics is way too big for an answer, especially with no detail on what your task is, and with no given code and no given performance. Tree structures can also have faster lookups, but off-loading the work to an existing library written in C is a good Python tactic if you can find one.

Upvotes: 0

shx2
shx2

Reputation: 64328

Yes, it is looping through every index and checking for equality.

This:

someElement in someList

Is equivalent to:

any( x == someElement for x in someList )

To speed up, you can probably use a set instead of list, but that really depends on the type of the elements in your collections.

If the list is big, the lookup can be slow.

Upvotes: 6

vks
vks

Reputation: 67978

nc=set(someList)
if someElement in nc: #this will now be O(1) rather than O(n)

You can make a set out of your list and improve your performance.

Upvotes: 5

Related Questions