Affaeng
Affaeng

Reputation: 115

Quicksort: Performance issues with Python implementation

I am trying to implement a quicksort algorithm in python 3 that uses a partition function that checks for equal elements.

My solution seems to work find with respect to the solution, however needs up to 40 seconds on arrays of length 10^5 already, which is rather extreme.

I am fairly new to python and can not detect what causes the code to run this slowly.

I am using the following code:

import sys
import random

def partition3(a, l, r):

    # bringing all elements less than or equal to x to the front
    x = a[l]
    j = l;
    for i in range(l + 1, r + 1):
        if a[i] <= x:
            j += 1
            a[i], a[j] = a[j], a[i]
            #print('exchange happened:', a)
    a[l], a[j] = a[j], a[l]

    # shifting all elements with value x to the middle
    cnt = j-1
    xcnt = 0
    i = l
    while i < cnt:
        if a[i] == x:
            xcnt += 1
            while a[cnt] == x and cnt > i:
                cnt -= 1
            a[cnt], a[i] = a[i], a[cnt]
            cnt -= 1
        i += 1
    return j - xcnt, j

def randomized_quick_sort(a, l, r):
    if l >= r:
        return
    k = random.randint(l, r)
    a[l], a[k] = a[k], a[l]
    m1, m2 = partition3(a, l, r)
    randomized_quick_sort(a, l, m1 - 1);
    randomized_quick_sort(a, m2 + 1, r);

I would be very grateful to receive some advice on this issue.

Upvotes: 1

Views: 57

Answers (1)

Affaeng
Affaeng

Reputation: 115

I found the problem:

In the definition of partition3(a,l,r) instead of:

# shifting all elements with value x to the middle
cnt = j-1
xcnt = 0
i = l
while i < cnt:
    if a[i] == x:
        xcnt += 1
        while a[cnt] == x and cnt > i:
            cnt -= 1
        a[cnt], a[i] = a[i], a[cnt]
    i += 1

I needed to add a further xcnt += 1 to get correct counting of the elements equal to the the pivot, so that the code becomes:

# shifting all elements with value x to the middle
cnt = j-1
xcnt = 0
i = l
while i < cnt:
    if a[i] == x:
        xcnt += 1
        while a[cnt] == x and cnt > i:
            cnt -= 1
            xcnt += 1
        a[cnt], a[i] = a[i], a[cnt]
        cnt -= 1
    i += 1

This mistake lead to underperformance in data sets that contained very many equal values, because the length of the block of equal values was underestimated. Therefore many more than necessary function calls had been executed, reverting the solution essentially back to a standard Quicksort without accounting for equal elements at all.

Upvotes: 1

Related Questions