Reputation: 15471
I am attempting to understand one of the prime number algorithms enumerated here: https://wiki.haskell.org/index.php?title=Prime_numbers&oldid=36858#Postponed_Filters_Sieve, specifically:
primes :: [Integer]
primes = 2: 3: sieve (tail primes) [5,7..]
  where
    sieve (p:ps) xs = h ++ sieve ps [x | x <- t, x `rem` p /= 0]
                      -- or: filter ((/=0).(`rem`p)) t
      where (h,~(_:t)) = span (< p*p) xs
So conceptually I understand how this algorithm works (sieve of Eratosthenes): start with 2, 3, and a list of candidate numbers, then eliminate any that are greater than the previous prime's square and divisible by any prime below it.
But I'm having a hard time following the nested recursive step (primes calls sieve on primes, which calls sieve on primes, which...).
I understand that this works due to lazy evaluation, and it demonstrably produces the right result, but I am incapable of following it.
So, for example, if I were to run take 5 primes, what would actually happen? (I will refer to the result of the take operation as t for ease of reading/reasoning.)
Step 1) primes returns a list [2,3, xs], so t is [2,3, take 3 xs], where xs is sieve (tail primes) [5,7..]
Step 2) tail primes is 3:xs, where xs is sieve (tail primes) [5,7..]
etc., so t should now be [2,3,3,3,3,3...]
I have little trouble following sieve itself...
So I guess I have two questions.
1) How exactly does this algorithm actually work, and where/why is my trace wrong?
2) Is there a way, generally, in Haskell to figure out what order things are running in? Maybe print a recursion tree? Or at the very least drop in a debugger halt?
Upvotes: 1
Views: 118
Reputation: 27003
I took the liberty of de-optimizing and clarifying the algorithm a little bit:
primes :: [Integer]
primes = 2 : sieve primes [3 ..]

sieve :: [Integer] -> [Integer] -> [Integer]
sieve [] xs = xs -- degenerate case for testing
sieve (p:ps) xs = h ++ sieve ps [x | x <- t, x `rem` p /= 0]
  where (h, t) = span (< p*p) xs
This is the same base logic, but it does a lot more redundant work (a constant factor per output value, though) than the version you provided. I think that's a simpler starting point, and once you understand how this version works, it's easy to see what the optimizations do. I also pulled sieve into its own definition. It didn't use anything from its enclosing scope, and the ability to test it standalone might help with understanding what's going on.
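If you do want to poke at it standalone, a quick ghci sanity check (my own, using the untraced definitions above) might look like this:

ghci> sieve [] [9,10,11]
[9,10,11]
ghci> take 4 (sieve [2] [3..])
[3,5,7,9]

The 9 in the second result is expected: with only 2 available for sieving, nothing ever removes the multiples of 3.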
If you'd like to peek into how evaluation proceeds, you can use the Debug.Trace module. The two functions I use most from it are trace and traceShow, depending on the value I want to see.
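For reference, their types are:

trace     :: String -> a -> a
traceShow :: Show a => a -> b -> b

Each returns its last argument unchanged, printing its message to stderr as a side effect when that result is forced.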
So, let's get a bit of tracing info from sieve:
import Debug.Trace

primes :: [Integer]
primes = 2 : sieve primes [3 ..]

sieve :: [Integer] -> [Integer] -> [Integer]
sieve [] xs = trace "degenerate case for testing" xs
sieve (p:ps) xs = traceShow (p, h) $ h ++ sieve ps [x | x <- t, x `rem` p /= 0]
  where (h, t) = span (< p*p) xs
And to test it out:
ghci> take 10 primes
[2(2,[3])
,3(3,[5,7])
,5,7(5,[11,13,17,19,23])
,11,13,17,19,23(7,[29,31,37,41,43,47])
,29]
Well, that's a lot less clear than hoped. When ghci prints out a result, it uses the Show instance for the result's type. And the Show instance for [Integer] is itself lazy, so the printing of the list gets interleaved with the tracing. To do better, let's have ghci produce a value that won't be output until after the tracing is complete. sum should do:
ghci> sum $ take 10 primes
129
That was... less than useful. Where'd the tracing go? Well, remember that the tracing functions are very impure. Their explicit goal is to produce side effects. But GHC doesn't respect side effects; it assumes that all functions are pure. One result of that assumption is that it can store the result of evaluating an expression and reuse it. (Whether it does so or not depends on whether there is a shared reference or CSE optimizations kick in. In this case, primes itself is a shared reference.)
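Here is a tiny, self-contained sketch (mine, not part of the algorithm) of that sharing effect. The first time the list is forced, the traces fire; after that, the already-evaluated cells are reused and nothing further is printed:

import Debug.Trace

squares :: [Int]
squares = map (\n -> trace ("evaluating " ++ show n) (n * n)) [1, 2, 3]

-- ghci> sum squares   -- prints "evaluating 1", "evaluating 2", "evaluating 3", then 14
-- ghci> sum squares   -- prints just 14; the list cells were already evaluated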
Maybe if we ask it to evaluate further than it has so far?
ghci> sum $ take 20 primes
(11,[53,59,61,67,71,73,79,83,89,97,101,103,107,109,113])
639
Ok, the tracing is separate from ghci's output as desired. But it's not really very informative at that point. To get a better picture, it needs to start back at the beginning. To do that, we need to get ghci to unload the definition of primes so that it will re-evaluate it from scratch. There are a bunch of ways to do this, but I'll demonstrate a method that is useful in other situations as well.
ghci> :load *sieve.hs
[1 of 1] Compiling Main ( sieve.hs, interpreted )
Ok, modules loaded: Main.
By putting the * in front of the file name in the :load command, I instructed ghci to interpret the source from scratch, regardless of its current state. This works in this case because it forces a re-interpretation even though the source hasn't changed. It is also useful when you want to use :load on a source file that has compiled output in the current directory, and have it interpret the whole module rather than just load the compiled code.
ghci> sum $ take 10 primes
(2,[3])
(3,[5,7])
(5,[11,13,17,19,23])
(7,[29,31,37,41,43,47])
129
Now, let's get into how the algorithm actually works. The first thing to look at is what the components of the tracing output are. The first element is the prime whose multiples are being sieved out of the potential outputs. The second element is the list of values being accepted as primes because they're less than p*p, and all non-primes less than that have already been removed from the candidate list. The mechanics of that should be familiar from any study of the sieve of Eratosthenes.
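You can sanity-check that reading of the trace. For the (7,[29,31,37,41,43,47]) line, every accepted value is below 7*7 and has no factor among the earlier primes (a quick check of my own):

ghci> all (< 49) [29,31,37,41,43,47]
True
ghci> all (\x -> all (\p -> x `rem` p /= 0) [2,3,5]) [29,31,37,41,43,47]
True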
The calls to sieve start with sieve primes [3..]. The first place laziness critically comes into play is the pattern match on the first argument. The (:) constructor is already known, so the pattern matches p to the literal 2, and ps to an unevaluated expression. It's very important that it's unevaluated, because this call to sieve is what calculates that value. If it forced it to be evaluated to proceed, it would introduce a circular data dependency, which results in an infinite loop.
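If the self-referential definition itself is the confusing part, the same knot-tying pattern shows up in the classic fibs definition, which may be an easier place to see it (this is just an analogy of mine, not part of the answer's code):

fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

-- zipWith only ever consumes elements of fibs that have already been
-- produced, so the self-reference never demands a value that doesn't
-- exist yet.

ghci> take 8 fibs
[0,1,1,2,3,5,8,13]

primes works the same way: sieve only ever pattern-matches on primes it has already emitted.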
As the tracing indicates, the prime being used to remove elements from the candidates is 2. The call to span splits the input [3..] into ([3], [4..]). h is [3], as demonstrated by the tracing output. So the result of the call to sieve is [3] ++ <recursive call to sieve>. This is the second place laziness critically comes into play in the algorithm. The implementation of (++) doesn't do anything at all with its second argument until it has already produced the prefix of the list. This means that before the recursive call to sieve is evaluated, it's known that ps refers to a thunk that evaluates to [3] ++ <recursive call>.
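Two quick ghci experiments (mine, not from the original) make both of those points concrete, one for span and one for the laziness of (++):

ghci> span (< 4) [3,4,5,6,7]
([3],[4,5,6,7])
ghci> take 1 ([3] ++ undefined)
[3]

The second line shows that (++) can hand out its prefix without ever touching its second argument.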
That's enough information to handle the recursive call to sieve. Now, p is matched to 3, ps is matched to a thunk, and the logic continues. The tracing should illustrate what's going on at this point.
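Written out as a rough hand evaluation (my own sketch, with the thunks abbreviated), the first few steps look something like this:

primes
  = 2 : sieve primes [3..]
  = 2 : ([3] ++ sieve ps1 [5,7,9,...])                 -- p = 2; span (< 4) [3..] = ([3],[4..]); evens filtered out
  = 2 : 3 : sieve ps1 [5,7,9,...]                      -- ps1 is the still-unevaluated rest of primes, now known to start with 3
  = 2 : 3 : ([5,7] ++ sieve ps2 [11,13,17,19,...])     -- p = 3; span (< 9) gives h = [5,7]; multiples of 3 filtered out
  = 2 : 3 : 5 : 7 : ...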
Now, the version you started with does a few things to optimize. First, it observes that the first element of t is always going to equal p*p, and it uses pattern matching to eliminate that element without doing any remainder calculation on it. This is a small saving per prime examined, but it is a clear saving.
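You can check that claim in ghci for, say, p = 5. At that point the candidates are the odd numbers from 11 up with the multiples of 3 removed, and the remainder produced by span really does begin at 5*5 (a check of my own):

ghci> let (h, t) = span (< 25) [x | x <- [11,13..], x `rem` 3 /= 0]
ghci> (h, take 3 t)
([11,13,17,19,23],[25,29,31])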
Second, it skips filtering out the multiples of two, and just doesn't generate them in the first place. This reduces the number of elements generated to be filtered later by a factor of two, and it reduces the number of filters applied to each odd element by one.
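As a small illustration (mine), generating only the odds up front yields the same candidate stream as filtering the evens away, without ever producing them:

ghci> take 6 [x | x <- [3..], x `rem` 2 /= 0]
[3,5,7,9,11,13]
ghci> take 6 [3,5..]
[3,5,7,9,11,13]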
As an aside, note that the stacking-filter behavior is actually algorithmically significant, and not faithful to the sieve of Eratosthenes as described in the literature. For further discussion of this, see The Genuine Sieve of Eratosthenes by Melissa O'Neill.
Upvotes: 2