Reputation: 281
I am working through Cormen et. al., Introduction to Algorithms, 3rd ed., but I also have an interest in Haskell. Section 8.2 (p. 194) covers counting sort. I was interested in how it and many algorithms might be implemented in haskell as they often use array access and destructive update. I took a look at the implementation on RosettaCode (copied below) and I find it very difficult to follow.
import Data.Array
countingSort :: (Ix n) => [n] -> n -> n -> [n]
countingSort l lo hi = concatMap (uncurry $ flip replicate) count
where count = assocs . accumArray (+) 0 (lo, hi) . map (\i -> (i, 1)) $ l
One of the things I like about haskell is how algorithms can be very clear (e.g. Haskell quicksort examples), at least as a specification that hasn't been optimized. This seems very unclear and I wonder if it's necessarily so or just overdone.
Can someone
Upvotes: 2
Views: 945
Reputation: 60153
It is indeed doing a counting sort. Here's a slightly rewritten version that I find easier to understand:
import Data.Array
countingSort :: (Ix n) => [n] -> n -> n -> [n]
countingSort l lo hi = concat [replicate times n | (n, times) <- counts]
where counts = assocs (accumArray (+) 0 (lo, hi) [(i, 1) | i <- l])
Let's break it down step by step. We'll use the list [5, 3, 1, 2, 3, 4, 5]
.
*Main> [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]
[(5,1),(3,1),(1,1),(2,1),(3,1),(4,1),(5,1)]
We're just taking every element of the list and turning it into a tuple with 1. This is the basis for our counts. Now we need a way to sum up those counts per element. This is where accumArray
comes into play.
*Main> accumArray (+) 0 (1, 5) [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]
array (1,5) [(1,1),(2,1),(3,2),(4,1),(5,2)]
The first parameter to accumArray
is the operation to apply during accumulation (just simple addition for us). The second parameter is the starting value, and the third parameter is the bounds. So we end up with an array mapping numbers to their counts in the input.
Next we use assocs
to get key/value tuples from the map:
*Main> assocs $ accumArray (+) 0 (1, 5) [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]
[(1,1),(2,1),(3,2),(4,1),(5,2)]
And then replicate
to repeat each number based on its count:
*Main> [replicate times n | (n, times) <- assocs $ accumArray (+) 0 (1, 5) [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]]
[[1],[2],[3,3],[4],[5,5]]
Finally, we use concat to turn this list of lists into a single list:
*Main> concat [replicate times n | (n, times) <- assocs $ accumArray (+) 0 (1, 5) [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]]
[1,2,3,3,4,5,5]
In the actual function I wrote above, I used where
to break up this one-liner.
I find list comprehension to be easier to deal with than map
and concatMap
, so those are the main changes I made in my version of the function.
(uncurry $ flip replicate)
is a nice trick... flip replicate
gives you a version of replicate
that takes its arguments in the opposite order. uncurry
takes that curried function and turns it into a function that takes a tuple as an argument instead. Together, those produce the same result as my list comprehension, which destructured the tuple and then passed its parameters in reverse order. I'm not familiar enough with Haskell to know if this is a common idiom, but for me, the list comprehension was easier to follow.
Upvotes: 7
Reputation: 10487
count
through accumArray
works as histogram builder. That is, for each number from lo
to hi
it returns how many times the number occurs in the argument list.
count :: (Ix i, Num e) => [i] -> i -> i -> [(i, e)]
count l lo hi = assocs . accumArray (+) 0 (lo, hi) . map (\i -> (i, 1)) $ l
count [6,2,1,6] 0 10 ==
[(0,0),(1,1),(2,1),(3,0),(4,0),(5,0),(6,2),(7,0),(8,0),(9,0),(10,0)]
Results of count
are used to generate original elements back from this specification. That is done by replicate
ing each fst
element of tuple snd
number of times . This yields list of lists that are concatenated together.
f l lo hi = map (uncurry $ flip replicate ) $ count l lo hi
f [6,2,1,6] 0 10 == [[],[1],[2],[],[],[],[6,6],[],[],[],[]]
Full solution is equivalent to
countingSort l lo hi = concat $ f l lo hi
Upvotes: 3