horatius83
horatius83

Reputation: 393

foldr resulting in stack overflow

I'm trying to generate primes using the Sieve of Eratosthenes algorithm on an infinite list. I've heard that foldr will examine the list lazily, but every time I try to use the following algorithm I get a stack overflow exception:

getPrimes :: [Int]
getPrimes = foldr getNextPrime [2] [3,5..]
    where
        getNextPrime n primes
            | not $ isAnyDivisibleBy primes n = primes ++ [n]
            | otherwise = primes
        isAnyDivisibleBy primes n = any (\x -> isDivisibleBy n x) primes
        isDivisibleBy x y = x `mod` y == 0

example:

takeWhile (\x -> x < 10) getPrimes
*** Exception: stack overflow

Somewhere the list is getting evaluated, but I can't figure out where.

Upvotes: 2

Views: 92

Answers (3)

bheklilr
bheklilr

Reputation: 54058

foldr is defined as

foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)

So when you plug in the arguments you would get

foldr getNextPrime [2] [3,5..]
    = getNextPrime 3 (foldr getNextPrime [2] [5,7..])
    = getNextPrime 3 (getNextPrime 5 (foldr getNextPrime [2] [7,9..])
    etc...

For this to lazily produce values (what you want when dealing with infinite lists) then getNextPrime needs to lazily produce values. Looking at the definition of getNextPrime, primes ++ [n], meaning you're appending a value onto the end of your primes list, but primes for getNextPrime 3 is getNextPrime 5 (foldr getNextPrime [2] [7,9..]). But then primes for getNextPrime 5 is getNextPrime 7 (foldr getNextPrime [2] [9,11..]), etc, etc. You never actually are able to produce a normal form value for primes, it's always a chain of computations that never return.

Another way to look at this is to look at this is to replace getNextPrime with an operator, let's call it .:

foldr (.:) [2] [3,5..9]
    = 3 .: (5 .: (7 .: (9 .: [2])))

(this is why it's called a right fold, the parens are nested to the right)

This works great for using : in foldr:

foldr (:) [2] [3,5..9]
    = 3 : (5 : (7 : (9 : [2])

because : just builds a new data structure, and the first element of this data structure can be inspected without calculating the rest of the structure. But .: isn't so nice, it first needs to calculate x1 = 9 .: [2], then x2 = 7 .: x1, then x3 = 5 .: x2, and finally 3 .: x3. For [3,5..] instead, you never can calculate the last call of something .: [2], but haskell keeps trying to compute it and that blows the stack.

Upvotes: 2

Andr&#225;s Kov&#225;cs
Andr&#225;s Kov&#225;cs

Reputation: 30103

foldr getNextPrime [2] [3, 5 .. ] expands to:

(getNextPrime 3 (getNextPrime 5 (getNextPrime 7 ...

Since getNextPrime always needs to inspect its second argument, we just get a non-terminating chain of getNextPrime calls, and the initial [2] list is never used.

Upvotes: 2

dfeuer
dfeuer

Reputation: 48580

I think the foldr is confusing you, so let's write it out with explicit recursion:

getPrimes :: [Int]
getPrimes = getPrimesUsing [3,5..]

getPrimesUsing :: [Int]->[Int]
getPrimesUsing [] = [2]
getPrimesUsing (n:primes)
  | not $ isAnyDivisibleBy primes n = primes ++ [n]
  | otherwise = primes
  where
    primes = getPrimesUsing primes
    isAnyDivisibleBy primes n = any (\x -> isDivisibleBy n x) primes
    isDivisibleBy x y = x `mod` y == 0

Can you see the trouble now?

An unrelated point: the algorithm you seem to be trying to implement here is not the Sieve of Eratosthenes, but rather a much less efficient algorithm called trial division.

Upvotes: 3

Related Questions