Reputation: 317

Why that version of mergesort is faster

Based on that answer here are two versions of merge function used for mergesort. Could you help me to understand why the second one is much faster. I have tested it for list of 50000 and the second one is 8 times faster (Gist).

def merge1(left, right):
    i = j = inv = 0
    merged = []
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            inv += len(left[i:])

    merged += left[i:]
    merged += right[j:]
    return merged, inv

def merge2(array1, array2):
    inv = 0
    merged_array = []
    while array1 or array2:
        if not array1:
            merged_array.append(array2.pop())
        elif (not array2) or array1[-1] > array2[-1]:
            merged_array.append(array1.pop())
            inv += len(array2)
        else:
            merged_array.append(array2.pop())
    merged_array.reverse()
    return merged_array, inv

Here is the sort function:

def _merge_sort(list, merge):
    len_list = len(list)
    if len_list < 2:
        return list, 0
    middle = len_list / 2
    left, left_inv   = _merge_sort(list[:middle], merge)
    right, right_inv = _merge_sort(list[middle:], merge)
    l, merge_inv = merge(left, right)
    inv = left_inv + right_inv + merge_inv
    return l, inv

import numpy.random as nprnd
test_list = nprnd.randint(1000, size=50000).tolist()

test_list_tmp = list(test_list) 
merge_sort(test_list_tmp, merge1)

test_list_tmp = list(test_list) 
merge_sort(test_list_tmp, merge2)

Upvotes: 7

Answers (3)

kreativitea

Reputation: 1791

You can use the excellent cProfile module to help you solve things like this.

>>> import cProfile
>>> a = range(1,20000,2)
>>> b = range(0,20000,2)
>>> cProfile.run('merge1(a, b)')
         70002 function calls in 0.195 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.184    0.184    0.195    0.195 <pyshell#7>:1(merge1)
        1    0.000    0.000    0.195    0.195 <string>:1(<module>)
    50000    0.008    0.000    0.008    0.000 {len}
    19999    0.003    0.000    0.003    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


>>> cProfile.run('merge2(a, b)')
         50004 function calls in 0.026 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.016    0.016    0.026    0.026 <pyshell#12>:1(merge2)
        1    0.000    0.000    0.026    0.026 <string>:1(<module>)
    10000    0.002    0.000    0.002    0.000 {len}
    20000    0.003    0.000    0.003    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    20000    0.005    0.000    0.005    0.000 {method 'pop' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'reverse' of 'list' objects}

After looking at the information a bit, it looks like the commenters are correct-- its not the len function-- it's the string module. The string module is invoked when you compare the length of things, as follows:

>>> cProfile.run('0 < len(c)')
         3 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

It is also invoked when slicing a list, but this is a very quick operation.

>>> len(c)
20000000
>>> cProfile.run('c[3:2000000]')
         2 function calls in 0.011 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.011    0.011    0.011    0.011 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

TL;DR: Something in the string module is taking 0.195s in your first function, and 0.026s in your second function. : apparently, the rebuilding of the array in inv += len(left[i:]) this line.

Upvotes: 3

will

Reputation: 10650

Similar answer as kreativitea's above, but with more info (i think!)

So profiling the actual merge functions, for the merging of two 50K arrays,

merge 1

         311748 function calls in 15.363 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001   15.363   15.363 <string>:1(<module>)
        1   15.322   15.322   15.362   15.362 merge.py:3(merge1)
   221309    0.030    0.000    0.030    0.000 {len}
    90436    0.010    0.000    0.010    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

merge2

         250004 function calls in 0.104 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.104    0.104 <string>:1(<module>)
        1    0.074    0.074    0.103    0.103 merge.py:20(merge2)
    50000    0.005    0.000    0.005    0.000 {len}
   100000    0.010    0.000    0.010    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   100000    0.014    0.000    0.014    0.000 {method 'pop' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'reverse' of 'list' objects}

So for merge1, it's 221309 len, 90436 append, and takes 15.363 seconds. So for merge2, it's 50000 len, 100000 append, and 100000 pop and takes 0.104 seconds.

len and append pop are all O(1) (more info here), so these profiles aren't showing what's actually taking the time, since going of just that, it should be faster, but only ~20% so.

Okay the cause is actually fairly obvious if you just read the code:

In the first method, there is this line:

inv += len(left[i:])

so every time that is called, it has to rebuild an array. If you comment out this line (or just replace it with inv += 1 or something) then it becomes faster than the other method. This is the single line responsible for the increased time.

Having noticed this is the cause, the issue can be fixed by improving the code; change it to this for a speed up. After doing this, it will be faster than merge2

inv += len(left) - i

Update it to this:

def merge3(left, right):
    i = j = inv = 0
    merged = []
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            inv += len(left) - i

    merged += left[i:]
    merged += right[j:]
    return merged, inv

Upvotes: 10

jeremy

Reputation: 4314

If I had to guess, I would say it probably has to do with the cost of removing elements from a list, removing from the end (pop) is quicker than removing from the beginning. the second favors removing elements from the end of the list.

See Performance Notes: http://effbot.org/zone/python-list.htm

"The time needed to remove an item is about the same as the time needed to insert an item at the same location; removing items at the end is fast, removing items at the beginning is slow."

Upvotes: 0