CopOnTheRun

Reputation: 1207

What causes the performance difference between these two implementations of insertion sort?

I've got two implementations of insertion sort. The first is pretty much a transcription of the example in my Java textbook (albeit with a while loop instead of Java's for loop):

def insertion_sort(array):
    for i in range(1, len(array)):
        j = i
        # walk the new element left until it is no smaller than its neighbor
        while j > 0 and array[j] < array[j - 1]:
            array[j], array[j - 1] = array[j - 1], array[j]
            j = j - 1
    return array

The second seems to be a more Pythonic implementation:

def insertion_sort2(array):
    for i in range(1, len(array)):
        for j in range(i - 1, -1, -1):
            if array[j + 1] < array[j]:
                array[j + 1], array[j] = array[j], array[j + 1]
            else:
                # the element has reached its place; stop early
                break
    return array

After timing both, it seems the second is much slower (3 to 4 times slower) when the list is already sorted or very nearly sorted. However, as the number of inversions increases, the second implementation seems to gain the upper hand (10-15% faster). I was just wondering if someone could explain what the cause of this discrepancy might be. Thanks!

Edit: Here's the function I use to create a random list.

import random

def rand_list(length):
    return [random.randint(0, 9 * length) for _ in range(length)]

When I want a list that's partially sorted, I just call list(range(length1)) + rand_list(length2).

As for timing how long they take to run, I've used both the %timeit magic and the difference between two datetime.now() calls; they've both given pretty consistent results. Also, I just want to emphasize that I'm creating a new list every time, not re-sorting an already sorted list.
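For reference, here's a rough sketch of the kind of timing setup I mean (it assumes insertion_sort and insertion_sort2 from above are in scope; the sizes and repeat counts are arbitrary placeholders):

import random
import timeit

def rand_list(length):
    return [random.randint(0, 9 * length) for _ in range(length)]

# Each timed call builds a fresh list, so no run ever sees a list that a
# previous run already sorted in place. List construction is included in
# both measurements, so the comparison between the two sorts stays fair.
for build, label in [
    (lambda: list(range(1000)), "sorted"),
    (lambda: rand_list(1000), "random"),
    (lambda: list(range(500)) + rand_list(500), "partially sorted"),
]:
    t1 = timeit.timeit(lambda: insertion_sort(build()), number=100)
    t2 = timeit.timeit(lambda: insertion_sort2(build()), number=100)
    print(label, t1, t2)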

Upvotes: 4

Views: 205

Answers (1)

Raymond Hettinger

Reputation: 226296

I was able to reproduce your timings. For random data or reverse-sorted data, insertion_sort2 is a bit faster than insertion_sort on Python 3.6. However, as you noted, insertion_sort2 is three times slower for data that is already sorted.

The reason for the first case (slightly faster on random or reverse-sorted data) is that range_iterator objects generate decreasing integers faster (using C code internally) than manually writing j = j - 1 and j > 0, which are both pure-Python steps. Accordingly, once the for-loop is warmed up, it runs a bit faster than the while-loop equivalent.
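You can see this effect in isolation with a quick microbenchmark (a rough sketch; the helper names are made up for illustration, and absolute numbers will vary by machine):

import timeit

def count_down_for(n):
    # the range_iterator hands out the decreasing integers in C
    for j in range(n - 1, 0, -1):
        pass

def count_down_while(n):
    # the comparison and the decrement are both pure-Python bytecode steps
    j = n - 1
    while j > 0:
        j = j - 1

print(timeit.timeit('count_down_for(1000)', globals=globals(), number=10000))
print(timeit.timeit('count_down_while(1000)', globals=globals(), number=10000))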

The reason for the second case (much slower on already-sorted data) is that a while-loop is cheaper than the equivalent for-loop when there are zero iterations. This is because the while-loop has no set-up costs. The equivalent for-loop still has to create both a range object and a range_iterator object before it can figure out that there are no iterations. Accordingly, while-loops beat for-loops when they are empty or nearly empty, because they avoid the set-up costs.
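Again as a rough sketch (with hypothetical helper names), the set-up cost shows up when the loop body never runs at all:

import timeit

def empty_for():
    # still builds a range object and a range_iterator
    # before discovering there is nothing to iterate over
    for j in range(0, 0, -1):
        pass

def empty_while():
    # a single comparison; no objects are created
    j = 0
    while j > 0:
        j = j - 1

print(timeit.timeit(empty_for, number=1000000))
print(timeit.timeit(empty_while, number=1000000))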

Upvotes: 1
