joeA
joeA

Reputation: 281

Explain and Clarify Haskell Counting Sort

I am working through Cormen et. al., Introduction to Algorithms, 3rd ed., but I also have an interest in Haskell. Section 8.2 (p. 194) covers counting sort. I was interested in how it and many algorithms might be implemented in haskell as they often use array access and destructive update. I took a look at the implementation on RosettaCode (copied below) and I find it very difficult to follow.

import Data.Array

countingSort :: (Ix n) => [n] -> n -> n -> [n]
countingSort l lo hi = concatMap (uncurry $ flip replicate) count
  where count = assocs . accumArray (+) 0 (lo, hi) . map (\i -> (i, 1)) $ l

One of the things I like about haskell is how algorithms can be very clear (e.g. Haskell quicksort examples), at least as a specification that hasn't been optimized. This seems very unclear and I wonder if it's necessarily so or just overdone.

Can someone

  1. clarify what's going on here,
  2. perhaps provide a more instructive and clear implementation, and
  3. tackle whether this is actually implementing counting sort or if non-strictness (lazyness) and immutability mean that this is actually some other sort disguised as counting sort?

Upvotes: 2

Views: 945

Answers (2)

user94559
user94559

Reputation: 60153

It is indeed doing a counting sort. Here's a slightly rewritten version that I find easier to understand:

import Data.Array

countingSort :: (Ix n) => [n] -> n -> n -> [n]
countingSort l lo hi = concat [replicate times n | (n, times) <- counts]
  where counts = assocs (accumArray (+) 0 (lo, hi) [(i, 1) | i <- l])

Let's break it down step by step. We'll use the list [5, 3, 1, 2, 3, 4, 5].

*Main> [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]
[(5,1),(3,1),(1,1),(2,1),(3,1),(4,1),(5,1)]

We're just taking every element of the list and turning it into a tuple with 1. This is the basis for our counts. Now we need a way to sum up those counts per element. This is where accumArray comes into play.

*Main> accumArray (+) 0 (1, 5) [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]
array (1,5) [(1,1),(2,1),(3,2),(4,1),(5,2)]

The first parameter to accumArray is the operation to apply during accumulation (just simple addition for us). The second parameter is the starting value, and the third parameter is the bounds. So we end up with an array mapping numbers to their counts in the input.

Next we use assocs to get key/value tuples from the map:

*Main> assocs $ accumArray (+) 0 (1, 5) [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]
[(1,1),(2,1),(3,2),(4,1),(5,2)]

And then replicate to repeat each number based on its count:

*Main> [replicate times n | (n, times) <- assocs $ accumArray (+) 0 (1, 5) [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]]
[[1],[2],[3,3],[4],[5,5]]

Finally, we use concat to turn this list of lists into a single list:

*Main> concat [replicate times n | (n, times) <- assocs $ accumArray (+) 0 (1, 5) [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]]
[1,2,3,3,4,5,5]

In the actual function I wrote above, I used where to break up this one-liner.

I find list comprehension to be easier to deal with than map and concatMap, so those are the main changes I made in my version of the function.

(uncurry $ flip replicate) is a nice trick... flip replicate gives you a version of replicate that takes its arguments in the opposite order. uncurry takes that curried function and turns it into a function that takes a tuple as an argument instead. Together, those produce the same result as my list comprehension, which destructured the tuple and then passed its parameters in reverse order. I'm not familiar enough with Haskell to know if this is a common idiom, but for me, the list comprehension was easier to follow.

Upvotes: 7

Luka Rahne
Luka Rahne

Reputation: 10487

count through accumArray works as histogram builder. That is, for each number from lo to hi it returns how many times the number occurs in the argument list.

count ::  (Ix i, Num e) => [i] -> i -> i -> [(i, e)]
count l lo hi = assocs . accumArray (+) 0 (lo, hi) . map (\i -> (i, 1)) $ l

count [6,2,1,6] 0 10 ==  
   [(0,0),(1,1),(2,1),(3,0),(4,0),(5,0),(6,2),(7,0),(8,0),(9,0),(10,0)]

Results of count are used to generate original elements back from this specification. That is done by replicateing each fst element of tuple snd number of times . This yields list of lists that are concatenated together.

 f l lo hi = map (uncurry $ flip replicate ) $ count l lo hi
 f [6,2,1,6] 0 10 == [[],[1],[2],[],[],[],[6,6],[],[],[],[]]

Full solution is equivalent to

countingSort l lo hi = concat $ f l lo hi

Upvotes: 3

Related Questions