LSM07
LSM07

Reputation: 817

.NET Optimization of F# Sieve of Eratosthenes

I am fiddling with F# and using the FSI REPL I noticed that there was a huge efficiency discrepancy between two slightly different implementations of my beginner's naive Sieve of Eratosthenes implementation. The first with an extra if:

let rec sieve max current pList =
    match current with
    | 2 -> sieve max (current + 1) (current::pList)
    | 3 -> sieve max (current + 2) (current::pList)
    | n when n < max ->
        if (n % 5 = 0) || (n % 3 = 0) then
            sieve max (current + 2) (current::pList)
        else if (pList |> (List.map (fun n -> current % n = 0)) |> List.contains true) then
            sieve max (current + 2) (pList)
        else
            sieve max (current + 2) (current::pList)
    | n when n >= max
        -> pList
    | _
        ->  printfn "Error: list length: %A, current: %A" (List.length pList) current
            [-1]

and the second without:

let rec sieve max current pList =
    match current with
    | 2 -> sieve max (current + 1) (current::pList)
    | 3 -> sieve max (current + 2) (current::pList)
    | n when n < max ->
        if (pList |> (List.map (fun n -> current % n = 0)) |> List.contains true) then
            sieve max (current + 2) (pList)
        else
            sieve max (current + 2) (current::pList)
    | n when n >= max
        -> pList
    | _
        ->  printfn "Error: list length: %A, current: %A" (List.length pList) current
            [-1]

The one with the extra if branch is actually slower despite the fact it should seem it would be faster. I then timed both implementations in the REPL using the following:

#time sieve 200000 2 [] #time

and saw that on my machine, the implementation with the extra if statement took around 2 minutes while the one without took around 1 minute each time it was run. How is this possible? By adding an if statement which takes care of multiples of 3 or 5, it actually is slower than only mapping over the entire list and then finding if there are any divisors of a number in the list of primes. How? Is it just that F# is THAT optimized for dealing with lists?

Upvotes: 3

Views: 169

Answers (3)

dumetrulo
dumetrulo

Reputation: 1991

Of course, lists are not necessarily efficient. I once created a function to create an array of booleans where every prime number is true, and every non-prime is false:

let sieveArray limit =
    let a = Array.create limit true
    let rec setArray l h s x =
        if l <= h then
            a.[l] <- x
            setArray (l + s) h s x
    a.[0] <- false; a.[1] <- false
    for i = 0 to a.Length - 1 do
        if a.[i]
        then setArray (i + i) (a.Length - 1) i false
    a

To get a list of the actual primes, you can just map the indices, filter the resulting array:

sieveArray limit
|> Seq.mapi (fun i x -> i, x)
|> Seq.filter snd
|> Seq.map fst
|> Seq.toList

Upvotes: 0

Alex Netkachov
Alex Netkachov

Reputation: 13542

The extra if in the first sieve, presumably intended to be a shortcut, actually changes the behaviour a bit. Instead of removing everything divided by 3 and 5 it actually adds it. It is easy to see if you compare the output:

1st sieve: [19; 17; 15; 13; 11; 9; 7; 5; 3; 2]
2st sieve: [19; 17; 13; 11; 7; 5; 3; 2]

I assume that you want is something like this:

if (n % 5 = 0) || (n % 3 = 0) then
    sieve max (current + 2) (pList)

However, in this case it will not include 5 (obviously). So the correct code is

if (n <> 5 && n % 5 = 0) || (n <> 3 && n % 3 = 0) then
    sieve max (current + 2) (pList)

Check the performance of the code above - should be fine.

Upvotes: 3

Fyodor Soikin
Fyodor Soikin

Reputation: 80765

The extra if performs the extra computation, but doesn't break the execution flow, which happily continues on to the second if. So in effect, you have only added some useless computation to your function. No wonder it takes longer now!

You must be thinking of something like this:

if (a)
    return b;
if (x)
    return y;
else 
    return z;

Well, this won't work in F# the same way it works in C# or Java or whatever you're thinking in terms of. F# doesn't have "early return". There are no "statements", everything is an expression, everything has a result.

Adding such useless computation will actually generate a warning. If you pay attention to the warnings, you should notice one saying "this value is being discarded" or some such. The compiler is trying to help you out by pointing at a useless function call.

To fix this, replace the second if with elif:

if (n % 5 = 0) || (n % 3 = 0) then
    sieve max (current + 2) (current::pList)
elif (pList |> (List.map (fun n -> current % n = 0)) |> List.contains true) then
    sieve max (current + 2) (pList)
else
    sieve max (current + 2) (current::pList)

This will make the second branch only execute when the first one fails.

P.S. Come to think of it, such an if without an else shouldn't even compile, because its result type cannot be determined. I'm not sure what's going on there.

P.P.S. List.map f |> List.contains true is better expressed as List.exists f. Both shorter and more performant.

Upvotes: 2

Related Questions