lenhhoxung
lenhhoxung

Reputation: 2746

list and dictionary: which one is faster

I have the following pieces of code doing the sorting of a list by swapping pairs of elements:

# Complete the minimumSwaps function below.
def minimumSwaps(arr):
    counter = 0
    val_2_indx = {val: arr.index(val) for val in arr}
    for indx, x in enumerate(arr):
        if x != indx+1:
            arr[indx] = indx+1
            s_indx = val_2_indx[indx+1]
            arr[s_indx] = x

            val_2_indx[indx+1] = indx
            val_2_indx[x] = s_indx
            counter += 1
    return counter

def minimumSwaps(arr):
    temp = [0] * (len(arr) + 1)
    for pos, val in enumerate(arr):
        temp[val] = pos
    swaps = 0
    for i in range(len(arr)):
        if arr[i] != i+1:
            swaps += 1
            t = arr[i]
            arr[i] = i+1
            arr[temp[i+1]] = t
            temp[t] = temp[i+1]
            temp[i+1] = i
    return swaps

The second function works much faster than the first one. However, I was told that dictionary is faster than list. What's the reason here?

Upvotes: 1

Views: 55

Answers (1)

kaya3
kaya3

Reputation: 51037

A list is a data structure, and a dictionary is a data structure. It doesn't make sense to say one is "faster" than the other, any more than you can say that an apple is faster than an orange. One might grow faster, you might be able to eat the other one faster, and they might both fall to the ground at the same speed when you drop them. It's not the fruit that's faster, it's what you do with it.

If your problem is that you have a sequence of strings and you want to know the position of a given string in the sequence, then consider these options:

  • You can store the sequence as a list. Finding the position of a given string using the .index method requires a linear search, iterating through the list in O(n) time.
  • You can store a dictionary mapping strings to their positions. Finding the position of a given string requires looking it up in the dictionary, in O(1) time.

So it is faster to solve that problem using a dictionary. But note also that in your first function, you are building the dictionary using the list's .index method - which means doing n linear searches each in O(n) time, building the dictionary in O(n^2) time because you are using a list for something lists are slow at. If you build the dictionary without doing linear searches, then it will take O(n) time instead:

    val_2_indx = { val: i for i, val in enumerate(arr) }

But now consider a different problem. You have a sequence of numbers, and they happen to be the numbers from 1 to n in some order. You want to be able to look up the position of a number in the sequence:

  • You can store the sequence as a list. Finding the position of a given number requires linear search again, in O(n) time.
  • You can store them in a dictionary like before, and do lookups in O(1) time.
  • You can store the inverse sequence in a list, so that lst[i] holds the position of the value i in the original sequence. This works because every permutation is invertible. Now getting the position of i is a simple list access, in O(1) time.

This is a different problem, so it can take a different amount of time to solve. In this case, both the list and the dictionary allow a solution in O(1) time, but it turns out it's more efficient to use a list. Getting by key in a dictionary has a higher constant time than getting by index in a list, because getting by key in a dictionary requires computing a hash, and then probing an array to find the right index. (Getting from a list just requires accessing an array at an already-known index.)

This second problem is the one in your second function. See this part:

    temp = [0] * (len(arr) + 1)
    for pos, val in enumerate(arr):
        temp[val] = pos

This creates a list temp, where temp[val] = pos whenever arr[pos] == val. This means the list temp is the inverse permutation of arr. Later in the code, temp is used only to get these positions by index, which is an O(1) operation and happens to be faster than looking up a key in a dictionary.

Upvotes: 3

Related Questions