Reputation: 2508
I am getting unexpected(?) results from this mergesort implementation. It's extremely slow compared to my three-way quicksort(also written in python).
My quicksort finishes with 10000 elements after about 0.005s while mergesort needs 1.6s! Including the source code for both implementations.
Mergesort:
#Merges two sorted lists into one sorted list. recursively
def merge(left, right):
if len(left) == 0 and len(right) == 0:
return []
elif len(left) == 0:
return right
elif len(right) == 0:
return left
else:
if left[0] <= right[0]:
return left[:1] + merge(left[1:],right)
else:
return right[:1] + merge(left,right[1:])
#Splits a list in half and returns the halves
def halve(list):
return list[:len(list)//2],list[len(list)//2:]
#Mergesort
def mergesort(list):
if len(list) <= 1:
return list
left,right = halve(list)
left,right = mergesort(left),mergesort(right)
return merge(left,right)
Quicksort:
#Three-way QuickSort in Python
def quicksort(a):
if len(a) == 0:
return []
p = a[(len(a)-1)//2]
less = quicksort([x for x in a if x < p])
greater = quicksort([x for x in a if x > p])
equal = [x for x in a if x == p]
return less + equal + greater
Can someone come up with an explanation or maybe even a fix?
Upvotes: 2
Views: 1040
Reputation: 38247
Just out of curiosity, I wrote a quick implementation using generators (could be cleaner). How does this compare with those in the original method?
def merge(listA,listB):
iterA, iterB = iter(listA), iter(listB)
valA, valB = iterA.next(), iterB.next()
while True:
if valA <= valB:
yield valA
try:
valA = iterA.next()
except StopIteration:
yield valB
try:
while True:
yield iterB.next()
except StopIteration:
return
else:
yield valB
try:
valB = iterB.next()
except StopIteration:
yield valA
try:
while True:
yield iterA.next()
except StopIteration:
return
Upvotes: 0
Reputation: 107598
Guesses about performance are usually wrong, but i'll go with this once since i do have some experience with this. Profile if you really want to know:
You are adding lists, ie left[:1] + merge(left[1:],right)
, this is one of the slower operations in Python. It creates a new list from both lists, so your mergesort creates like N**2 intermediate lists. The quicksort on the other hand uses very fast LCs instead and creates less lists (I think like 2N or so).
Try using extend
instead of the +, maybe that helps.
Upvotes: 5
Reputation: 133975
A recursive mergesort is not really the best way to do things. You should get better performance with a straight iterative approach. I'm not very conversant with Python, so I'll give you the C-like pseudocode.
ileft = 0 // index into left array
iright = 0 // index into right array
iresult = 0 // index into result array
while (ileft < left.length && iright < right.length)
{
if (left[ileft] <= right[iright])
result[iresult++] = left[ileft++]
else
result[iresult++] = right[iright++]
}
// now clean up the remaining list
while (ileft < left.length)
result[iresult++] = left[ileft++]
while (iright < right.length)
result[iresult++] = right[iright++]
Upvotes: 3
Reputation: 12961
Explanation:
Typically, quicksort is significantly faster in practice than other Θ(nlogn) algorithms, because its inner loop can be efficiently implemented on most architectures, and in most real-world data, it is possible to make design choices which minimize the probability of requiring quadratic time.
Upvotes: 1