Let's say I have a dictionary: myDict = { 'title': 'a nice title', 'nice_list': [1,2,3,4,5,6,6,7,...,99999], 'nice_lists_last_item': 99999, } I only want to append an item to nice_list if it is larger than the final item. What is quicker: Using: if new_element > nice_list[-1] or Using: if new_element > nice_lists_last_item Does method 1 have to scan through the whole list (and/or put all of nice_list into memory each time) to find that item? Which is quicker? (bearing in mind I intend to do a few billion of these comparisons?) Method 2 would store the last element as its own distinct dict entry, so is that faster?

Reputation: 4473

List Lookup Performance - Does returning the last element of a list have to scan through the whole list?

Let's say I have a dictionary:

myDict = {
    'title': 'a nice title',
    'nice_list': [1,2,3,4,5,6,6,7,...,99999],
    'nice_lists_last_item': 99999,
}

I only want to append an item to nice_list if it is larger than the final item.

What is quicker:

Using: if new_element > nice_list[-1]

Using: if new_element > nice_lists_last_item

Does method 1 have to scan through the whole list (and/or put all of nice_list into memory each time) to find that item? Which is quicker? (bearing in mind I intend to do a few billion of these comparisons?)

Method 2 would store the last element as its own distinct dict entry, so is that faster?

Upvotes: 2

Answers (3)

senderle

Reputation: 151097

When in doubt, test:

>>> %timeit if 1 > myDict['nice_list'][-1]: 0
10000000 loops, best of 3: 110 ns per loop
>>> %timeit if 1 > myDict['nice_lists_last_item']: 0
10000000 loops, best of 3: 68.8 ns per loop
>>> nice_list = myDict['nice_list']
>>> %timeit if 1 > nice_list[-1]: 0
10000000 loops, best of 3: 62.6 ns per loop
>>> nice_lists_last_item = myDict['nice_lists_last_item']
>>> %timeit if 1 > nice_lists_last_item: 0                      
10000000 loops, best of 3: 43.4 ns per loop

As you can see, accessing the dictionary value directly is faster than accessing the list from the dictionary and then accessing its last value. But accessing the last value of the list directly is faster. This should be no surprise; Python lists know their own length and are implemented in memory as arrays, so finding the last item is as simple as subtracting 1 from the length and doing pointer arithmetic. Accessing dictionary keys is a bit slower because of the overhead of collision detection; but it's only slower by a few nanoseconds. And finally, if you really want to save a few more nanoseconds, you could store the last value in its own value.

The biggest slowdown comes when you do both.

Upvotes: 6

BrenBarn

Reputation: 251488

Getting an item from a list is O(1) as noted here. Even so, storing the value explicitly will still be faster, because no matter how fast the lookup is, it's still going to be slower than not doing a lookup at all. (However, if you store the value explicitly, you'll have to update it when you add a new item to the list; whether the combined cost if updating it and checking it is more than the cost of grabbing the last item every time is something you'll have to benchmark yourself; it will likely depend on how often you wind up actually appending a new item.)

Note that there is no question of "putting all of nice_list into memory". If you have a dict with a list in it, that entire list is already in memory. Looking up a value in it won't cause it to take up any more memory, but if you have billions of these lists, you will run out of memory before you even try to look anything up, because just creating the lists will use up too much memory.

Upvotes: 4

Elazar

Reputation: 21645

In CPython, the answer is probably no. list is implemented using dynamic arrays.

Upvotes: 1

List Lookup Performance - Does returning the last element of a list have to scan through the whole list?

Answers (3)

Related Questions