capone
capone

Reputation: 758

Haskell performance : Inversion count algorithm

I have decided to solve first programing assignment from Standford algorithm course https://class.coursera.org/algo-005 using Haskell. Despite I am very new to language I implemented it much faster than in c++. I have 6+ years of work experience in c++ so it impressed me a bit. But performance is disappointing: 0.19 sec (c++) vs 9.88 (haskell) version. How can I improve performance of Haskell implementation so it can be comparable to c++?

Here is my code in Haskell

data SortedList = SortedList {
    inversionCount :: Int,
    list :: [Int]
} deriving (Show) 

--      first   list        accumulator
packm :: Int -> SortedList -> Int -> SortedList
packm x (SortedList count xs) add =  SortedList (count + add) (x:xs)

merge2 :: [Int] -> [Int] -> SortedList
merge2 [] xs = SortedList 0 xs
merge2 xs [] = SortedList 0 xs
merge2 xlist@(x:xs) ylist@(y:ys)
    | x < y = packm x (merge2 xs ylist) 0
    | otherwise = packm y (merge2 xlist ys) $ length xlist

countAndMerge :: SortedList -> SortedList -> SortedList
countAndMerge (SortedList lcount lxs) (SortedList rcount rxs) =
    let merged = merge2 lxs rxs
    in SortedList (lcount + rcount + inversionCount merged) $ list merged

mergesort :: [Int] -> SortedList
mergesort [] = SortedList 0 []
mergesort [x] = SortedList 0 [x]
mergesort xs =
    let leftsorted = mergesort $ take halfElements xs
        rightsorted = mergesort $ drop halfElements xs
    in countAndMerge leftsorted rightsorted
    where halfElements = length xs `div` 2

main = do 
    contents <- getContents
    let intlist = [ read x :: Int | x <- (lines contents) ]
    print $ inversionCount $ mergesort intlist

Upvotes: 4

Views: 643

Answers (2)

Andr&#225;s Kov&#225;cs
Andr&#225;s Kov&#225;cs

Reputation: 30103

The biggest problem is that the asymptotic performance isn't right to begin with; it's O(n^2 * log n) rather than the optimal O(n * log n). The culprit is merge2:

    | otherwise = packm y (merge2 xlist ys) $ length xlist

length xlist is O(n). Supposing a random input list, we need to compute length xlist on about half of the merge2 calls, thus making one level of merging O(n^2).

Upvotes: 5

n. m. could be an AI
n. m. could be an AI

Reputation: 120031

otherwise = packm y (merge2 xlist ys) $ length xlist

This computes length at every other step of the merge on the average. This makes the whole business quadratic.

If you track length of lists not by counting elements, but by passing the count down from the top level, you restore the O(N log N) behaviour. For a list of 100000 elements this means execution time goes down from 20 seconds to 0.45 second (on my machine with -O2).

Scaling it further up without changing the algorithm is problematic, because it currently runs in linear stack space, and cannot cope with 1 million elements with default RTS options. Change mergesort to a merge-adjacent-pairs version, it is likely to run much better.

Upvotes: 2

Related Questions