Reputation: 817
I am fiddling with F# and using the FSI REPL I noticed that there was a huge efficiency discrepancy between two slightly different implementations of my beginner's naive Sieve of Eratosthenes implementation. The first with an extra if:
let rec sieve max current pList =
match current with
| 2 -> sieve max (current + 1) (current::pList)
| 3 -> sieve max (current + 2) (current::pList)
| n when n < max ->
if (n % 5 = 0) || (n % 3 = 0) then
sieve max (current + 2) (current::pList)
else if (pList |> (List.map (fun n -> current % n = 0)) |> List.contains true) then
sieve max (current + 2) (pList)
else
sieve max (current + 2) (current::pList)
| n when n >= max
-> pList
| _
-> printfn "Error: list length: %A, current: %A" (List.length pList) current
[-1]
and the second without:
let rec sieve max current pList =
match current with
| 2 -> sieve max (current + 1) (current::pList)
| 3 -> sieve max (current + 2) (current::pList)
| n when n < max ->
if (pList |> (List.map (fun n -> current % n = 0)) |> List.contains true) then
sieve max (current + 2) (pList)
else
sieve max (current + 2) (current::pList)
| n when n >= max
-> pList
| _
-> printfn "Error: list length: %A, current: %A" (List.length pList) current
[-1]
The one with the extra if
branch is actually slower despite the fact it should seem it would be faster.
I then timed both implementations in the REPL using the following:
#time
sieve 200000 2 []
#time
and saw that on my machine, the implementation with the extra if
statement took around 2 minutes while the one without took around 1 minute each time it was run.
How is this possible? By adding an if
statement which takes care of multiples of 3 or 5, it actually is slower than only mapping over the entire list and then finding if there are any divisors of a number in the list of primes.
How? Is it just that F# is THAT optimized for dealing with lists?
Upvotes: 3
Views: 169
Reputation: 1991
Of course, lists are not necessarily efficient. I once created a function to create an array of booleans where every prime number is true, and every non-prime is false:
let sieveArray limit =
let a = Array.create limit true
let rec setArray l h s x =
if l <= h then
a.[l] <- x
setArray (l + s) h s x
a.[0] <- false; a.[1] <- false
for i = 0 to a.Length - 1 do
if a.[i]
then setArray (i + i) (a.Length - 1) i false
a
To get a list of the actual primes, you can just map the indices, filter the resulting array:
sieveArray limit
|> Seq.mapi (fun i x -> i, x)
|> Seq.filter snd
|> Seq.map fst
|> Seq.toList
Upvotes: 0
Reputation: 13542
The extra if in the first sieve, presumably intended to be a shortcut, actually changes the behaviour a bit. Instead of removing everything divided by 3 and 5 it actually adds it. It is easy to see if you compare the output:
1st sieve: [19; 17; 15; 13; 11; 9; 7; 5; 3; 2]
2st sieve: [19; 17; 13; 11; 7; 5; 3; 2]
I assume that you want is something like this:
if (n % 5 = 0) || (n % 3 = 0) then
sieve max (current + 2) (pList)
However, in this case it will not include 5 (obviously). So the correct code is
if (n <> 5 && n % 5 = 0) || (n <> 3 && n % 3 = 0) then
sieve max (current + 2) (pList)
Check the performance of the code above - should be fine.
Upvotes: 3
Reputation: 80765
The extra if
performs the extra computation, but doesn't break the execution flow, which happily continues on to the second if
. So in effect, you have only added some useless computation to your function. No wonder it takes longer now!
You must be thinking of something like this:
if (a)
return b;
if (x)
return y;
else
return z;
Well, this won't work in F# the same way it works in C# or Java or whatever you're thinking in terms of. F# doesn't have "early return". There are no "statements", everything is an expression, everything has a result.
Adding such useless computation will actually generate a warning. If you pay attention to the warnings, you should notice one saying "this value is being discarded" or some such. The compiler is trying to help you out by pointing at a useless function call.
To fix this, replace the second if
with elif
:
if (n % 5 = 0) || (n % 3 = 0) then
sieve max (current + 2) (current::pList)
elif (pList |> (List.map (fun n -> current % n = 0)) |> List.contains true) then
sieve max (current + 2) (pList)
else
sieve max (current + 2) (current::pList)
This will make the second branch only execute when the first one fails.
P.S. Come to think of it, such an if
without an else
shouldn't even compile, because its result type cannot be determined. I'm not sure what's going on there.
P.P.S. List.map f |> List.contains true
is better expressed as List.exists f
. Both shorter and more performant.
Upvotes: 2