h__
h__

Reputation: 883

Why does this simplification make my function slower?

The following function computes the Fibonacci series by tail recursive and squaring:

(defun fib1 (n &optional (a 1) (b 0) (p 0) (q 1))
  (cond ((zerop n) b)
    ((evenp n) 
     (fib1 (/ n 2) 
          a 
          b 
          (+ (* p p) (* q q))
          (+ (* q q) (* 2 p q))))
    (t 
     (fib1 (1- n)
          (+ (* b q) (* a (+ p q)))
          (+ (* b p) (* a q))
          p
          q))))

Basically it reduces every odd input to a even one, and reduces every even input by half. For example,

F(21)
    = F(21 1 0 0 1) 
    = F(20 1 1 0 1) 
    = F(10 1 1 1 1) 
    = F(5 1 1 2 3)
    = F(4 8 5 2 3) 
    = F(2 8 5 13 21) 
    = F(1 8 5 610 987) 
    = F(0 17711 10946 610 987)
    = 10946

When I saw this I thought it might be better to combine the even and odd cases (since odd minus one = even), so I wrote

(defun fib2 (n &optional (a 1) (b 0) (p 0) (q 1))
    (if (zerop n) b
     (fib2 (ash n -1) 
          (if (evenp n) a (+ (* b q) (* a (+ p q))))
          (if (evenp n) b (+ (* b p) (* a q)))
          (+ (* p p) (* q q))
          (+ (* q q) (* 2 p q)))))

and hoping this will make it faster, as the equations above now becomes

F(21)
    = F(21 1 0 0 1)
    = F(10 1 1 1 1) 
    = F(5 1 1 2 3)
    = F(2 8 5 13 21) 
    = F(1 8 5 610 987) 
    = F(0 17711 10946 1346269 2178309)
    = 10946

However, it turned out to be much slower (takes about 50% more time in e.g. Clozure CL, CLisp and Lispworks) when I check the time needed for Fib(1000000) by the following code (Ignore the progn, I just don't want my screen filled with numbers.)

(time (progn (fib1 1000000)()))
(time (progn (fib2 1000000)()))

I can only see fib2 may do more evenp than fib1, so why is it that much slower?

EDIT: I think n.m. get it right, and I edited the second group of formulae. E.g. in the example of F(21) above, fib2 actually computes F(31) and F(32) in p and q, which is never used. So in F(1000000), fib2 computes F(1048575) and F(1048576).

Lazy evaluation rocks, that's a very good point. I guess in Common Lisp only some macro like "and" and "or" are evaluated lazily?

The following modified fib2 (defined for n>0) actually runs faster:

(defun fib2 (n &optional (a 1) (b 0) (p 0) (q 1))
    (if (= n 1) (+ (* b p) (* a q))
     (fib2 (ash n -1) 
          (if (evenp n) a (+ (* b q) (* a (+ p q))))
          (if (evenp n) b (+ (* b p) (* a q)))
          (+ (* p p) (* q q))
          (+ (* q q) (* 2 p q)))))

Upvotes: 3

Views: 246

Answers (2)

n. m. could be an AI
n. m. could be an AI

Reputation: 120049

Insert printing of the intermediate results. Pay attention to p and q towards the end of the computation.

You will notice that fib2 computes much larger values for p and q at the last step. These two values account for all the performance difference.

The ironic thing is that these expensive values are unused. This is why Haskell doesn't suffer from this performance problem: the unused values are not actually computed.

Upvotes: 3

Vatine
Vatine

Reputation: 21288

If nothing else, fib2 has more conditionals (while computing the arguments). That may well change how the code flow is done. Conditionals imply jumps, implies pipeline stalls.

It would probably be instructive to look at the generated code (try (disassemble #'fib1) and (disassemble #'fib2) and see if there's any blatant differences). It might also be worth to change the optimization settings, there's usually a fair few optimizations that are not done unless you request heavy optimization for speed.

Upvotes: 0

Related Questions