Reputation: 737
I have implemented two versions of Merge Sort in Haskell like follows:
mergeSort1 :: (Ord a) => [a] -> [a]
mergeSort1 xs = foldl' (\acc x -> merge [x] acc) [] xs
and
mergeSort2 :: (Ord a) => [a] -> [a]
mergeSort2 [] = []
mergeSort2 (x:[]) = [x]
mergeSort2 xs = (mergeSort2 $ fst halves) `merge` (mergeSort2 $ snd halves)
where halves = splitList xs
where 'merge' and 'splitList' are implemented as follows:
merge :: (Ord a) => [a] -> [a] -> [a]
merge [] [] = []
merge xs [] = xs
merge [] ys = ys
merge all_x@(x:xs) all_y@(y:ys)
| x < y = x:merge xs all_y
| otherwise = y:merge all_x ys
splitList :: [a] -> ([a], [a])
splitList zs = go zs [] [] where
go [] xs ys = (xs, ys)
go [x] xs ys = (x:xs, ys)
go (x:y:zs) xs ys = go zs (x:xs) (y:ys)
Doing last $ mergeSort2 [1000000,999999..0]
in ghci results in showing the number 1000000 after more than a minute of processing, while doing last $ mergeSort1 [1000000,999999..0]
results in showing the last element only after 5 seconds.
I can understand why mergeSort1 uses much less memory than mergeSort2 because of the tail-recursiveness of foldl' and so.
What I can't understand is why mergeSort1 is faster than mergeSort2 by such a big difference ?
Could it be that splitList is the bottleneck in mergeSort2, generating two new lists every call?
Upvotes: 3
Views: 932
Reputation: 183888
As is,
mergeSort2 :: (Ord a) => [a] -> [a]
mergeSort2 xs = (mergeSort2 $ fst halves) `merge` (mergeSort2 $ snd halves)
where halves = splitList xs
is an infinite recursion, since you haven't given a base case (you need to specify the result for lists of length . After that is fixed, < 2
)mergeSort2
is still relatively slow due to the splitList
which requires a complete traversal in each step and builds two new lists, not allowing to process anything before that is completed. A simple
splitList zs = splitAt h zs where h = length zs `quot` 2
does much better.
Your mergeSort1
, however, is not a merge sort at all, it is an insertion sort.
mergeSort1 :: (Ord a) => [a] -> [a]
mergeSort1 xs = foldl' (\acc x -> merge [x] acc) [] xs
That does particularly well on reverse-sorted input, but if you give it sorted or random input, it scales quadratically.
So mergeSort1
was faster because you gave it optimal input, where it finishes in linear time.
Upvotes: 9