Haskell, terminal call optimisation and lazy evaluation

I'm trying to write a findIndexBy which would return the index of an element selected in a list by an ordering function. This function is equivalent to sorting the list and returning the top element, but I want to implement it to be able to process lists without size limit.

findIndexBy :: (Ord a) => (a -> a -> Bool) -> [a] -> Integer
findIndexBy f (x:xs) = findIndexBy' xs x 1 0
  where
    findIndexBy' [] _ _ i = i
    findIndexBy' (x:xs) y xi yi = if f x y
      then findIndexBy' xs x (xi + 1) xi
      else findIndexBy' xs y (xi + 1) yi

With this implementation, I get a Stack space overflow when processing big list, as in the following example (trivial):

findIndexBy (>) [1..1000000]

I know there should be more elegant solutions to solve this problem, and I'm interested in knowing the most idiomatic and efficient ones, but I really want to understand what is wrong with my function.

I might be wrong, but I think my implementation of findIndexBy' is based on terminal recursion, so I don't really understand why the compiler doesn't seem to optimize the tail call.

I thought it might be due to the if/then/else and also trying the following, which results in the same error:

findIndexBy :: (Ord a) => (a -> a -> Bool) -> [a] -> Integer
findIndexBy f (x:xs) = findIndexBy' xs x 1 0
  where
    findIndexBy' [] _ _ i = i
    findIndexBy' (x:xs) y xi yi = findIndexBy' xs (if f x y then x else y) (xi + 1) (if f x y then xi else yi)

Is there a simple way to ask the compiler to show where tail-call optimization is (not) performed?

For reference, below is the equivalent function that I wrote in Clojure, and that I am now trying to port to Haskell:

(defn index-of [keep-func, coll]
  (loop [i 0
         a (first coll)
         l (rest coll)
         keep-i i]
    (if (empty? l)
      keep-i
      (let [keep (keep-func a (first l))]
        (recur
          (inc i) (if keep a (first l)) (rest l) (if keep keep-i (inc i)))))))

For information, the previously quoted Haskell code was compiled using the -O3 flag.

[edit after leventov answer]

The problem seems to be related to lazy evaluation. Although I found about $! and seq, I wonder what is the best practice when using them to fix the original code.

I'm still interested with more idiomatic implementations relying on functions from Data.List.

[edit]

The simplest fix is to add yi `seq` in the first snippet before the if statement.

Upvotes: 2

Answers (3)

kqr

Reputation: 15038

When you have realised that laziness is the issue, the second thing to look at is the general pattern you have implemented in your code. It seems to me like you are really just iterating a list and carrying along an intermediary value which you then return when the list is empty – this is a fold! And indeed you can implement your function in terms of fold:

findIndexBy f =
  snd . foldl1' (\x y -> if f x y then x else y) . flip zip [0..]

First, this function pairs up each element with its index (flip zip [0..]) in an (element, index) list. Then foldl1' (the strict version of the fold that crashes for empty lists) runs along the list and yanks out the tuple that satisfies your f. Then the index of this tuple (snd in this case) is returned.

Since we used a strict fold here, it will also resolve your problem without additional strictness annotations for GHC.

Upvotes: 2

Sassa NF

Reputation: 5406

Your code needs the accumulator value in order to produce the return value, so this is the case where laziness loses.
When accumulator is lazy, you get a looong chain of thunks that need evaluating in the end. This is what crashes your function. Declaring the accumulator strict, you get rid of thunks and it works on large lists. The use of foldl' is typical in such cases.
The difference in the Core:

Without bangs:

main_findIndexBy' =
  \ ds_dvw ds1_dvx ds2_dvy i_aku ->
    case ds_dvw of _ {
      [] -> i_aku;
      : x_akv xs_akw ->
          ...
          (plusInteger ds2_dvy main4)

With bangs:

main_findIndexBy' =
  \ ds_dyQ ds1_dyR ds2_dyS i_akE ->
    case ds_dyQ of _ {
      [] -> i_akE;
      : x_akF xs_akG ->
        case ds2_dyS of ds3_Xzb { __DEFAULT ->
        ...
        (plusInteger ds3_Xzb main4)

The difference is minute, indeed. In the first case it uses the original argument, ds2_dvy, to add 1 to it, in the second case it first pattern-matches the value of the argument - without even looking at what it matched -, which causes evaluation of it, and the value gets into ds3_Xzb.

Upvotes: 3

leventov

Reputation: 15313

Adding bang patterns has worked for me. i. e.

{-# LANGUAGE BangPatterns #-}
findIndexBy :: (Ord a) => (a -> a -> Bool) -> [a] -> Integer
findIndexBy f (x:xs) = findIndexBy' xs x 1 0
  where
    findIndexBy' [] _ _ i = i
    findIndexBy' (x:xs) !y !xi !yi = findIndexBy' xs (if f x y then x else y) (xi + 1) (if f x y then xi else yi)

To look at what GHC does with code, compile as ghc -O3 -ddump-simpl -dsuppress-all -o tail-rec tail-rec.hs > tail-rec-core.hs

See Reading GHC Core.

However, I haven't found much difference between Core output with and without bang patterns.

Upvotes: 3

Haskell, terminal call optimisation and lazy evaluation

Answers (3)

Related Questions