Madison May
Madison May

Reputation: 2753

Python List Indexing Efficiency

Quick question about the built in python list object. Say you have a list with the numbers 0 - 99. You are writing a program that takes the last item in the list and uses it for some other purpose. Is it more efficient to use list[-1] than to use list[99]? In other words, does python iterate through the whole list in either case?

Thanks for your help.

Upvotes: 15

Views: 15267

Answers (7)

Pengbert Wu
Pengbert Wu

Reputation: 1

mylist[-1] is about 30% to 45% slower than mylist[99] on my machine.

>>> def test():
...     t99=timeit.timeit('mylist[99]',setup='mylist=list(range(100))',number=10000000)
...     t_1=timeit.timeit('mylist[-1]',setup='mylist=list(range(100))',number=10000000)
...     return (t_1, t99, (t_1-t99)*100/t99)
... 
>>> test()
(0.21327159996144474, 0.13456149981357157, 58.49377441312871)
>>> test()
(0.17166510014794767, 0.13119220011867583, 30.850081020563916)
>>> test()
(0.19142579985782504, 0.13216119981370866, 44.842661936827426)
>>> test()
(0.1880386001430452, 0.1329137000720948, 41.47420472159728)
>>> test()
(0.18617470003664494, 0.1398134999908507, 33.159315837761085)
>>> test()
(0.17610100004822016, 0.1407316999975592, 25.13243288560744)
>>> test()
(0.19496860005892813, 0.14028189983218908, 38.983432853531)
>>> test()
(0.19262430001981556, 0.13199010002426803, 45.938445371584066)
>>> 

Upvotes: 0

Somesh
Somesh

Reputation: 1295

List indexing in python is always of O(1).

For further details on time complexity follow this link

Upvotes: 4

Samy Vilar
Samy Vilar

Reputation: 11120

you can timeit:

>>> import timeit
>>> timeit.Timer('values[99]', 'values = range(100)').timeit(number = 10**7)
0.44513392448425293
>>> timeit.Timer('values[99]', 'values = range(100)').timeit(number = 10**7)
0.45273900032043457
>>> timeit.Timer('values[-1]', 'values = range(100)').timeit(number = 10**7)
0.44431495666503906
>>> timeit.Timer('values[-1]', 'values = range(100)').timeit(number = 10**7)
0.44684290885925293
>>> timeit.Timer('values[-1]', 'values = range(100)').timeit(number = 10**7)
0.44867610931396484
>>> timeit.Timer('values[-1]', 'values = range(100)').timeit(number = 10**8)
4.4455509185791016
>>> timeit.Timer('values[99]', 'values = range(100)').timeit(number = 10**8)
4.4184651374816895
>>> timeit.Timer('values[99]', 'values = range(100)').timeit(number = 10**8)
4.4276700019836426
>>> timeit.Timer('values[-1]', 'values = range(100)').timeit(number = 10**8)
4.4026989936828613
>>> timeit.Timer('values[-1]', 'values = range(100)').timeit(number = 10**8)
4.4386618137359619
>>> timeit.Timer('values[99]', 'values = range(100)').timeit(number = 10**8)
4.3991479873657227
>>> 

theres really no difference, though if you truly want the last item values[-1] seems be the easiest/safest approach being that it will always grab the last item regardless of the length of the list, as long as its not an empty list,

that would throw an exception:

>>> [][-1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> 

In other words, does python iterate through the whole list in either case?

In neither case would python iterate through the list.

I was actually curious to see if python did anything differently, so I disassemble the code:

>>> import dis
>>> def test_0():
...     values = range(100)
...     return values[99]
...
>>> def test_1():
...     values = range(100)
...     return values[-1]
...
>>> dis.dis(test_0)
2           
    0 LOAD_GLOBAL              0 (range)
    3 LOAD_CONST               1 (100)
    6 CALL_FUNCTION            1
    9 STORE_FAST               0 (values)

3          
    12 LOAD_FAST                0 (values)
    15 LOAD_CONST               2 (99)
    18 BINARY_SUBSCR
    19 RETURN_VALUE
>>> dis.dis(test_1)
2           
    0 LOAD_GLOBAL              0 (range)
    3 LOAD_CONST               1 (100)
    6 CALL_FUNCTION            1
    9 STORE_FAST               0 (values)

3          
    12 LOAD_FAST                0 (values)
    15 LOAD_CONST               2 (-1)
    18 BINARY_SUBSCR
    19 RETURN_VALUE
>>>

and well it looks like at the instruction level, its pretty much identical, you would need to go into the CPython implementation, to see exactly whats going on when dealing with negative indices, but I think most of the other answers have already hinted at it.

$ python --version
Python 2.6.1 

out of curiosity I dugged even deeper and found this:

on python 2.7.1, but it should be the same on most python 2.*

./Python/ceval.c:

case BINARY_SUBSCR:
    w = POP();
    v = TOP();
    if (PyList_CheckExact(v) && PyInt_CheckExact(w)) {
        /* INLINE: list[int] */
        Py_ssize_t i = PyInt_AsSsize_t(w);
        if (i < 0)
            i += PyList_GET_SIZE(v);
        if (i >= 0 && i < PyList_GET_SIZE(v)) {
            x = PyList_GET_ITEM(v, i);
            Py_INCREF(x);
        }
        else
            goto slow_get;
    }
    else
      slow_get:
        x = PyObject_GetItem(v, w);
    Py_DECREF(v);
    Py_DECREF(w);
    SET_TOP(x);
    if (x != NULL) continue;
    break;

note the if (i < 0) i += PyList_GET_SIZE(v); so basically yes theres a slight constant overhead when dealing with negative indices.

and in case your curious ./Include/listobject.h: #define PyList_GET_ITEM(op, i) (((PyListObject *)(op))->ob_item[i]) so its basically a look up ;)

though again the difference is tiny and if your goal is to state you want the last value then this values[-1] is much more pythonic/clearer of making this intent, values[99] simply means get the 99th value value if the programmer doesn't know it has 100 values then he doesn't know its the last value.

Upvotes: 6

kindall
kindall

Reputation: 184131

Python does not iterate through lists to find a particular index. Lists are arrays (of pointers to elements) in contiguous memory and so locating the desired element is always a simple multiplication and addition. If anything, list[-1] will be slightly slower because Python needs to add the negative index to the length to get the real index. (I doubt it is noticeably slower, however, because all that's done in C anyway.)

Upvotes: 23

Ashwini Chaudhary
Ashwini Chaudhary

Reputation: 250911

A simple timeit test result in almost equal times, with the negative index being slightly slower

lis=list(xrange(10000000))
def f1():
    a,b,c,d,e=lis[-1],lis[-2],lis[-3],lis[-4],lis[-5]    

def f2():
    a,b,c,d,e=lis[0],lis[1],lis[2],lis[3],lis[4]

if __name__=="__main__":
    from timeit import Timer
    t = Timer("f1()", "from __main__ import f1")
    print t.timeit()
    t = Timer("f2()", "from __main__ import f2")
    print t.timeit()

output:

0.878027235305
0.845932094722

Upvotes: 1

mgilson
mgilson

Reputation: 309881

Why not test and see?

import timeit
t=timeit.timeit('mylist[99]',setup='mylist=list(range(100))',number=10000000)
print (t)
t=timeit.timeit('mylist[-1]',setup='mylist=list(range(100))',number=10000000)
print (t)

Of course, as you can see by running this a couple times, there's really no (noticable) difference for the reasons pointed out in the other answers.

Upvotes: 8

Laurence Gonsalves
Laurence Gonsalves

Reputation: 143114

It doesn't iterate in either case. list[-1] is essentially identical to list[len(list) - 1]. A list is backed by an array, so lookups are constant time.

Upvotes: 4

Related Questions