dbyrne
dbyrne

Reputation: 61011

Recursive function causing a stack overflow

I am trying to write a simple sieve function to calculate prime numbers in clojure. I've seen this question about writing an efficient sieve function, but I am not to that point yet. Right now I am just trying to write a very simple (and slow) sieve. Here is what I have come up with:

(defn sieve [potentials primes]
  (if-let [p (first potentials)]
    (recur (filter #(not= (mod % p) 0) potentials) (conj primes p))
    primes))

For small ranges it works fine, but causes a stack overflow for large ranges:

user=> (sieve (range 2 30) [])
[2 3 5 7 11 13 17 19 23 29]
user=> (sieve (range 2 15000) [])
java.lang.StackOverflowError (NO_SOURCE_FILE:0)

I thought that by using recur this would be a non-stack-consuming looping construct? What am I missing?

Upvotes: 39

Views: 2345

Answers (2)

Will Ness
Will Ness

Reputation: 71065

Algorithmically the problem is that you continue filtering when there's no more purpose to it. Stopping as early as possible achieves quadratic reduction in recursion depth (sqrt(n) vs. n):

(defn sieve [potentials primes]    
  (if-let [p (first potentials)]
      (if (> (* p p) (last potentials))
        (concat primes potentials)
        (recur (filter (fn [n] (not= (mod n p) 0)) potentials)
               (conj primes p)))
    primes))

Runs OK for 16,000 (performing just 30 iterations instead of 1862), and for 160,000 too, on ideone. Even runs 5% faster without the doall.

Upvotes: -1

Michał Marczyk
Michał Marczyk

Reputation: 84331

You're being hit by filter's laziness. Change (filter ...) to (doall (filter ...)) in your recur form and the problem should go away.

A more in-depth explanation:

The call to filter returns a lazy seq, which materialises actual elements of the filtered seq as required. As written, your code stacks filter upon filter upon filter..., adding one more level of filtering at each iteration; at some point this blows up. The solution is to force the whole result at each iteration so that the next one will do its filtering on a fully realised seq and return a fully realised seq instead of adding an extra layer of lazy seq processing; that's what doall does.

Upvotes: 56

Related Questions