deltanovember
deltanovember

Reputation: 44051

In Scala, why does my Sieve algorithm runs so slowly?

I'm trying to implement the Sieve of Eratosthenes using lists and filters rather than arrays and looping. I'm not sure why the following performs significantly worse than an imperative equivalent. 1 million should absolutely fly but my machine grinds to a halt.

  val max = 1000000
  def filterPrimes(upper: Int, seed: Int = 2, sieve: List[Int] = List()): List[Int] =
    sieve.map(x => if (x % seed == 0 && x > seed) 0 else x).filter(_ > 0)

  var filtered: List[Int] = (2 to max).toList
  for (i <- 2 to max / 2) filtered = filterPrimes(max, i, filtered)
  filtered.foreach(println(_))

Upvotes: 8

Views: 502

Answers (5)

user unknown
user unknown

Reputation: 36229

This is a fast sieve, implementing hints of mergeconflict and some of the hints from the paper, mentioned by Ken Wayne VanderL :

def createPrimes (MAX: Int) : Array[Boolean] = {
  val pri = (false :: false :: true :: List.range (3, MAX + 1).map (_ % 2 != 0)).toArray
  for (i <- List.range (3, MAX)
    if (pri (i))) {
      var j = 2 * i;
      while (j < MAX) {
        if (pri (j))
          pri (j) = false;
        j += i;
      }
    }
  pri
}
val MAX = 1000*1000
(1 to MAX).filter (createPrimes (MAX))

Comparing graph: graph comparing 4 algos with different MAX-sizes The vertical axis shows seconds, the horizontal is from 100 000 to 1 000 000 primes. The deltaNovember-algorithm is already improved to run to math.sqrt(max) only, and the filtering, suggested by Alexey Romanov in the comment. From Dan Burton I took the second algorithm and the last one with a small modification, to fit my Interface (List, not Array) and the bitSet Sieve, which he only linked to in a comment, but which is the fastest.

Upvotes: 2

Dan Burton
Dan Burton

Reputation: 53675

I would make a few modifications.

  • It seems odd that you perform filterPrimes for all numbers between 2 and max / 2, the "actual" sieve technique requires you only perform filterPrimes for all primes between 2 and sqrt(max).
  • It also seems odd that you use a var and a for loop. To do it the "functional" way, I would use a recursive function instead.
  • Instead of performing filterPrimes on the entire list, you can collect the primes as you go; no need to throw those through the filter over and over.
  • It is rather strange to map and then filter the way you do, since the map simply flags which elements to filter, you can accomplish the same using only filter.

So here was my first attempt at these modifications:

def filterFactors(seed: Int, xs: List[Int]) = {
  xs.filter(x => x % seed != 0)
}

def sieve(max: Int) = {
  def go(xs: List[Int]) : List[Int] = xs match {
    case y :: ys => {
      if (y*y > max) y :: ys
      else y :: go(filterFactors(y, ys))
    }
    case Nil => Nil
  }

  go((2 to max).toList)
}

However, this reflects my Haskell bias, and has a huge flaw: it will take up a huge amount of stack space, due to the recursive call y :: go(...) in the go helper function. Running sieve(1000000) resulted in an "OutOfMemoryError" for me.

Let's try a common FP trick: tail recursion with accumulators.

def sieve(max: Int) = {
  def go(xs: List[Int],
         acc: List[Int])
         : List[Int] = xs match {
    case y :: ys => {
      if (y*y > max) acc.reverse ::: (y :: ys)
      else go(filterFactors(y, ys), y :: acc)
    }
    case Nil => Nil
  }

  go((2 to max).toList, Nil)
}

By adding an accumulator value we are able to write the go helper function in tail-recursive form, thus avoiding the huge stack problem from before. (Haskell's evaluation strategy is very different; it therefore neither needs nor benefits from tail recursion)

Now let's compare speed with an imperative, mutation-based approach.

def mutationSieve (max: Int) = {
    var arr: Array[Option[Int]] =
        (2 to max).map (x => Some (x)).toArray
    var i = 0
    var seed = (arr (i)).get
    while (seed * seed < max) {
        for (j: Int <- (i + seed) to (max - 2) by seed) {
          arr (j) = None
        }
        i += 1
        while (arr (i).isEmpty) {
          i += 1
        }
        seed = (arr (i)).get
    }
    arr.flatten
}

Here I use an Array[Option[Int]], and "cross off" a number by replacing its entry with "None". There is a tiny bit of room for optimization; perhaps a small speed boost could be obtained by using an array of bools, where the index represents the particular number. Whatever.

Using very primitive techinques (carefully placed new Date() calls...) I benchmarked the functional version to be approximately 6 times slower than the imperative version. It is clear that the two approaches have the same big-Oh time complexity, but the constant factors involved in programming with linked lists do incur a cost.

I also benchmarked your version, using Math.sqrt(max).ceil.toInt instead of max / 2: it was about 15x slower than the functional version I presented here. Interestingly, it is estimated1 that approximately 1 out of every 7 numbers between 1 and 1000 (sqrt(1000000)) is prime (1 / ln(1000)), therefore, a large part of the slowdown can be attributed to the fact that you perform the loop on every single number, while I perform my function only for every prime. Of course, if it took 15x as long to perform ~1000 iterations, it would take ~7500x as long to perform 500000 iterations, which is why your original code is agonizingly slow.

Upvotes: 5

Ray Toal
Ray Toal

Reputation: 88378

Lists are immutable and every call to filterPrimes creates a new list. You are creating a lot of lists, which is, by the way, unnecessary.

Go with your first instinct (what you probably call the "imperative equivalent") which I am guessing uses a single mutable array.

(Edited to make clear that I understood that the creation of multiple lists was unnecessary.)

Upvotes: 0

mergeconflict
mergeconflict

Reputation: 8276

There are a few potential issues, although I don't really see a single "smoking gun"... Anyway, here's what I've got. First:

sieve.map(x => if (x % seed == 0 && x > seed) 0 else x).filter(_ > 0)

could be written more concisely as:

sieve.filter(x => x <= seed || x % seed != 0)

Next, upper is unused in filterPrimes (this should have no effect on performance though).

Third, don't use a var and a for loop if you want to really use a pure functional style, instead turn filterPrimes into a tail-recursive function. The compiler might be clever enough to optimize away the copies if you do it this way (although I wouldn't hold my breath).

Finally, and probably most importantly, your for loop is wasting a huge amount of time filtering out values that have necessarily already been filtered. For example, it tries to filter multiples of 4 after already having filtered all multiples of 2. If you want to use this sieve algorithm efficiently, you need to choose your seeds from the remaining elements in the list.

In other words, keep an index into the list, and determine the seed from the index, like:

iteration 0: 2 3 4 5 6 7 8 9 ...
      index: ^

iteration 1: 2 3 5 7 9 ...
      index:   ^

iteration 2: 2 3 5 7 ...
      index:     ^

this avoids the duplicate effort. Also, you don't need to keep iterating until you get to max, I think you can actually stop when you get past sqrt(max).

Upvotes: 10

Ken Wayne VanderLinde
Ken Wayne VanderLinde

Reputation: 19349

If you'd like to see a functional way of doing the sieve, check out The Genuine Sieve of Eratosthenes.

Upvotes: 10

Related Questions