Reputation: 126412
In the book "The Haskell Road to Logic, Maths and Programming", the authors present two alternative ways of finding the least divisor k
of a number n
with k > 1
, claiming that the second version is much faster than the first one. I have trouble understanding why (I am a beginner).
Here is the first version (page 10):
ld :: Integer -> Integer   -- finds the smallest divisor of n which is > 1
ld n = ldf 2 n

ldf :: Integer -> Integer -> Integer
ldf k n | n `rem` k == 0 = k
        | k ^ 2 > n      = n
        | otherwise      = ldf (k + 1) n
If I understand this correctly, the ld
function basically ends up iterating through all integers in the interval [2..sqrt(n)]
and stops as soon as one of them divides n
, returning it as the result.
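That reading can be checked directly. Here is a self-contained sketch (the definitions are just the ones from the question, repeated so the snippet loads on its own):

```haskell
-- ld from the question: trial division starting at 2.
ld :: Integer -> Integer
ld n = ldf 2 n

ldf :: Integer -> Integer -> Integer
ldf k n | n `rem` k == 0 = k
        | k ^ 2 > n      = n
        | otherwise      = ldf (k + 1) n

main :: IO ()
main = do
  print (ld 91)  -- 7: tries 2..6 first, then finds that 7 divides 91
  print (ld 97)  -- 97: no k with k^2 <= 97 divides it, so n itself is returned
```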
The second version, which the authors claim to be much faster, goes like this (page 23):
ldp :: Integer -> Integer  -- finds the smallest divisor of n which is > 1
ldp n = ldpf allPrimes n

ldpf :: [Integer] -> Integer -> Integer
ldpf (p:ps) n | rem n p == 0 = p
              | p ^ 2 > n    = n
              | otherwise    = ldpf ps n
allPrimes :: [Integer]
allPrimes = 2 : filter isPrime [3..]

isPrime :: Integer -> Bool
isPrime n | n < 1     = error "Not a positive integer"
          | n == 1    = False
          | otherwise = ldp n == n
The authors claim that this version is faster because it iterates only through the list of primes within the interval 2..sqrt(n)
, instead of iterating through all numbers in that range.
However, this argument doesn't convince me: the recursive function ldpf
consumes, one by one, the numbers from the list of primes allPrimes
, and that list is generated by running filter
over the list of all integers.
So unless I am missing something, this second version also ends up iterating through all numbers in the interval 2..sqrt(n)
, but for each number it first checks whether it is prime (a relatively expensive operation) and, if so, whether it divides n
(a relatively cheap one).
I would say that just checking whether k
divides n
for each k
should be faster. Where is the flaw in my reasoning?
Upvotes: 4
Views: 509
Reputation: 63349
The main advantage of the second solution is that the list of primes allPrimes
is computed only once. Thanks to lazy evaluation, each call computes just the primes it needs, or reuses the ones that have already been computed. So the expensive part is evaluated only once and then simply reused.
For computing the least divisor of just a single number, the first version is indeed more efficient. But try running ldp
and ld
for all numbers between, say, 1 and 100000 and you'll see the difference.
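A minimal way to see this sharing in action (a sketch, not the book's benchmark): allPrimes is a top-level constant, so its evaluated prefix survives between calls to ldp, whereas ld redoes its trial division from scratch every time.

```haskell
-- Both versions from the question, side by side.
ld :: Integer -> Integer
ld n = ldf 2 n

ldf :: Integer -> Integer -> Integer
ldf k n | n `rem` k == 0 = k
        | k ^ 2 > n      = n
        | otherwise      = ldf (k + 1) n

ldp :: Integer -> Integer
ldp n = ldpf allPrimes n

ldpf :: [Integer] -> Integer -> Integer
ldpf (p:ps) n | rem n p == 0 = p
              | p ^ 2 > n    = n
              | otherwise    = ldpf ps n

-- Top-level constant (a CAF): its evaluated prefix is retained and reused
-- by every later call to ldp.
allPrimes :: [Integer]
allPrimes = 2 : filter isPrime [3..]

isPrime :: Integer -> Bool
isPrime n | n < 1     = error "Not a positive integer"
          | n == 1    = False
          | otherwise = ldp n == n

main :: IO ()
main = do
  -- Same answers either way; time each line in GHCi (:set +s) to see the
  -- difference the shared prime list makes over many calls.
  print (sum (map ld  [2 .. 20000]))
  print (sum (map ldp [2 .. 20000]))
```

Both sums must agree, since each function returns the smallest divisor greater than 1 (the number itself when it is prime); only the work done differs.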
Upvotes: 8
Reputation: 18747
As I understand it, the division operation isn't as expensive as you might think. For the divisor 2, half of the candidates filtered out of allPrimes
are rejected by an operation as cheap as a one-bit right shift, while the first algorithm performs a relatively expensive true division by the whole number each time. Say the candidate divisor is 1956: it is filtered out of allPrimes
at almost no cost by the very first test (divisible by 2), whereas dividing, say, 2^4253-1
by 1956 is pointless from the start, since that number isn't even divisible by 2, and for really big numbers divisions take a lot of time, at least half of them (or 5/6, counting the divisors 2 and 3) wasted. Also, allPrimes
is a cached list, so checking whether the next integer is a prime to be included in allPrimes
consults only already-verified primes, which keeps the primality test from being extremely expensive even for an actual prime. Combined, this gives the second method its advantage.
Upvotes: 1
Reputation: 51835
Haskell is unknown to me, so without proper measurements of both versions I can only assume that the claims are right. In that case the reasons can be:
1. primes are precomputed in some array
2. primes are computed on the run (and memoized)
3. primes are computed on the run (and not memoized)
4. this is just a guess
Upvotes: 1