Reputation: 891
I'm beginning to try and get my head round haskell performance, and what makes things fast and slow, and I'm a little confused by this.
I have two implementations of a function that generates a list of primes up to a certain value. The first is straight off the Haskell wiki:
primesTo :: (Ord a, Num a, Enum a) => a -> [a]
primesTo m = eratos [2..m] where
eratos [] = []
eratos (p:xs) = p : eratos (xs `minus` [p*p, p*p+p..m])
The second is the same, but using an infinite list internally:
primes2 :: (Ord a, Num a, Enum a) => a -> [a]
primes2 m = takeWhile (<= m) (eratos [2..]) where
eratos [] = []
eratos (p:xs) = p : eratos (xs `minus` [p*p, p*p+p..])
In both cases, the minus function is:
minus :: (Ord a) => [a] -> [a] -> [a]
minus (x:xs) (y:ys) = case (compare x y) of
LT -> x : minus xs (y:ys)
EQ -> minus xs ys
GT -> minus (x:xs) ys
minus xs _ = xs
The latter implementation is significantly (~100x) slower than the former, and I don't get why. I would have thought that haskell's lazy evalutation would make them fairly equivalent under the hood.
This is obviously a reduced test case for the purposes of the question - in real life the optimisation would be no problem (although I don't understand why it is needed), but to me a function that just generates an infinite list of primes is more generically useful than a finite list, but appears slower to work with.
Upvotes: 4
Views: 138
Reputation: 74384
Looks like to me that there's a big difference between
(xs `minus` [p*p, p*p+p..m]) -- primesTo
(xs `minus` [p*p, p*p+p..]) -- primes2
The function minus
steps through lists pairwise and terminates when one list reaches the end. In the first minus
expression above, this occurs in no more than (m-p*p)/p
steps when the latter list is exhausted. In the second one, it will always take steps on the order of length xs
.
So your infinite lists have disabled at least one meaningful optimization.
Upvotes: 6
Reputation: 69964
One difference is that in the second case you need to generate one extra prime. You need to generate the first prime greater than m
before takeWhile
knows its time to stop.
Additionally, the [..m]
bounds on both the list to filter and the lists of multiples help reduce the number of calculations. Whenever one of these lists gets empty minus
immediately returns via its secons clause while in the infinite case the minus gets stuck in the first case. You can explore this a bit better if you also test the cases where only one of the lists is infinite:
--this is also slow
primes3 :: (Ord a, Num a, Enum a) => a -> [a]
primes3 m = takeWhile (<= m) (eratos [2..m]) where
eratos [] = []
eratos (p:xs) = p : eratos (xs `minus` [p*p, p*p+p..])
--this fast
primes4 :: (Ord a, Num a, Enum a) => a -> [a]
primes4 m = takeWhile (<= m) (eratos [2..]) where
eratos [] = []
eratos (p:xs) = p : eratos (xs `minus` [p*p, p*p+p..m])
Upvotes: 3