Won Kim
Won Kim

Reputation: 109

Quick sort counting

Python questions again. I want to count the number of comparison operations performed by quick sort. Because I use a recursive function, I do not think that assigning count = 0 to the beginning of the function body is inappropriate, so I made it as follows.

def QuickSort(lst, count = 0):
    if len(lst) > 1:
        pivot_idx = len(lst) // 2
        smaller_nums, larger_nums = [], []

        for idx, num in enumerate(lst):

            if idx != pivot_idx:
                if num < lst[pivot_idx]:
                    smaller_nums.append(num)

                else:
                    larger_nums.append(num)

        count = QuickSort(smaller_nums, count + 1)[1]
        count = QuickSort(larger_nums, count + 1)[1]
        lst[:] = smaller_nums + [lst[pivot_idx]] + larger_nums

    return lst, count

However, after counting, I confirmed the count which is much lower than my expectation. According to big o, the quick sort would have to show the calculation of n * log (n), but it showed a much lower count. For example, when sorting a list with 1000 random elements, we expected to see a count of 1000 * log (1000) = 6907, but actually only 1164 counts. I am wondering if I am misusing the count in the function or misunderstanding it. Thank you.

Upvotes: 0

Views: 1694

Answers (2)

Joe Larson
Joe Larson

Reputation: 53

Building on the answer provided by Gene by adding print statements and a sort "error" range, his example was very helpful to my understanding of quicksort and an error term on the big O impact of operations performance comparison.

    class QuickSort:

    def __init__(self):
        self.comparisons = 0

    def sort(self, lst):
        k_err = 0 # k << n, the value the sort array can be in error 
        if len(lst) <= 1:
            return lst
        pivot_idx = len(lst) // 2
        smaller, larger = [], []
        for idx, num in enumerate(lst):
            if idx != (pivot_idx) :
                self.comparisons += 1
                try:
                  (larger, smaller)[(num - k_err) < lst[pivot_idx]].append(num)
                except:
                  (larger, smaller)[(num + k_err) < lst[pivot_idx]].append(num)
            print(pivot_idx,"larger", self.comparisons, larger)
            print(pivot_idx, "smaller", self.comparisons,  smaller, )
        return self.sort(smaller) + [lst[pivot_idx]] + self.sort(larger)

def Run(n):
    random.seed(100) 
    lst = [random.randint(0,round(100,0)) for r in range(n)]
    quicksort = QuickSort()
    print(len(lst), lst)
    print(quicksort.sort(lst))
    print(quicksort.comparisons, quicksort.comparisons/n, ((quicksort.comparisons/n)/math.log(n,10)), math.log(n,10)  )

Upvotes: 0

Gene
Gene

Reputation: 46990

Your post is mistaken on several points:

  • Big-O is allows arbitrary constant factors and also ignoring the values for "small" values of n, where "small" can be arbitrarily large for any given analysis. So your computations are meaningless.
  • Your counts are wrong. There's one comparison per loop iteration. You're counting something else.
  • This is a strange way to code the count. Just use a global variable.

Try this. Note really you're using twice as many comparisons as this reports. The check that the loop index isn't the pivot could be eliminated with a smarter implementation.

c = 0

def QuickSort(lst):
    if len(lst) <= 1:
        return lst
    pivot_idx = len(lst) // 2
    smaller, larger = [], []
    for idx, num in enumerate(lst):
        if idx != pivot_idx:
            global c
            c += 1
            (larger, smaller)[num < lst[pivot_idx]].append(num)
    return QuickSort(smaller) + [lst[pivot_idx]] + QuickSort(larger)

def Run(n):
    lst = [random.randint(0,1000) for r in xrange(n)]
    QuickSort(lst)
    print c

Run(1000)

If you're aghast at the prospect of using a global variable, then you can just wrap the sort in a class:

import random

class QuickSort:

    def __init__(self):
        self.comparisons = 0

    def sort(self, lst):
        if len(lst) <= 1:
            return lst
        pivot_idx = len(lst) // 2
        smaller, larger = [], []
        for idx, num in enumerate(lst):
            if idx != pivot_idx:
                self.comparisons += 1
                (larger, smaller)[num < lst[pivot_idx]].append(num)
        return self.sort(smaller) + [lst[pivot_idx]] + self.sort(larger)

def Run(n):
    lst = [random.randint(0,1000) for r in xrange(n)]
    quicksort = QuickSort()
    print quicksort.sort(lst)
    print quicksort.comparisons

Run(100)

Upvotes: 2

Related Questions