Philippe
Philippe

Reputation: 1307

Debugging a stack overflow in haskell

I'm new to Haskell and functional programming and I have a program which works but overflows the stack after a few seconds. My question is, what should I do from here? How can I get at least a hint of where it occurs, print the stack or anything?

The program is very slow when running in ghci with :trace so the stack overflow does not occur. It does not occur also with runhaskell which will just eat more and more memory. I get the error only when compiling with ghc and executing.

Upvotes: 10

Views: 1240

Answers (3)

David Miani
David Miani

Reputation: 14668

In your case, it is a strictness problem that is causing the stack overflow. One really easy way to find such issues is using the deepseq library. This adds a few functions that allow you to fully evaluate a value (which is better than seq, which only goes one level down). The key function is force :: NFData a => a -> a. This takes a value, fully evaluates it, and the returns it.

It only works on types that implement the NFData type class though. Luckily, there is a template haskell macro in the deepseq-th library: deriveNFData. This is used with your own data types, eg deriveNFData ''BfMachine.

To use, you put force $ in front of your functions that may be having strictness problems (or liftM force $ for monadic functions). Eg with your code, I put it in front of step, since that was the key function in the file:

{-# LANGUAGE TemplateHaskell #-}
import Data.Char
import Debug.Trace
import Control.DeepSeq
import Control.DeepSeq.TH
import Control.Monad (liftM)

type Stack = [Int]

data BfMachine = BfMachine
    { program :: String
    , pc :: Int
    , stack :: Stack
    , sp :: Int
    } deriving Show
deriveNFData ''BfMachine

setElem :: [Int] -> Int -> Int -> [Int]
setElem list n value = map (\(i, v) -> if i == n then value else v) (zip [0..] list)

step :: BfMachine -> IO (BfMachine)
step m@(BfMachine { program = program, pc = pc, stack = stack, sp = sp }) = liftM force $
    case program !! pc of
    '-' -> return m { pc = pc + 1, stack = setElem stack sp ((stack !! sp) - 1) }
    '+' -> return m { pc = pc + 1, stack = setElem stack sp ((stack !! sp) + 1) }
    '<' -> return m { pc = pc + 1, sp = sp - 1 }
    '>' -> return m { pc = pc + 1, sp = sp + 1 }
    '[' -> return $ if stack !! sp /= 0 then m { pc = pc + 1 }
                    else m { pc = (findNextBracket program $ pc + 1) + 1 }
    ']' -> return m { pc = findPrevBracket program $ pc - 1 }
    '.' -> do putChar $ chr $ stack !! sp
              return m { pc = pc + 1 }
    ',' -> do c <- getChar
              let s' = setElem stack sp $ ord c
                 in return m { stack = s',  pc = pc + 1 }
    a -> return m { pc = pc + 1 }

findNextBracket :: String -> Int -> Int
findNextBracket program pos =
    case program !! pos of
    '[' -> findNextBracket program $ (findNextBracket program $ pos + 1) + 1
    ']' -> pos
    x -> findNextBracket program (pos + 1)

findPrevBracket :: String -> Int -> Int
findPrevBracket program pos =
    case program !! pos of
    ']' -> findPrevBracket program $ (findPrevBracket program $ pos - 1) - 1
    '[' -> pos
    x -> findPrevBracket program (pos - 1)

isFinished :: BfMachine -> Bool
isFinished m@(BfMachine { program = p, pc = pc })
    | pc == length p = True
    | otherwise = False

run :: BfMachine -> IO ()
run m = do
    if isFinished m then
        return ()
    else do
        m <- step m
        run m

fib = ">++++++++++>+>+[ [+++++[>++++++++<-]>.<++++++[>--------<-]+<<<]>.>>[ [-]<[>+<-]>>[<<+>+>-]<[>+<-[>+<-[>+<-[>+<-[>+<-[>+<- [>+<-[>+<-[>+<-[>[-]>+>+<<<-[>+<-]]]]]]]]]]]+>>> ]<<< ] This program doesn't terminate; you will have to kill it.  Daniel B Cristofani (cristofdathevanetdotcom) http://www.hevanet.com/cristofd/brainfuck/"
main = run BfMachine { program = fib , pc = 0, stack = replicate 1024 0, sp = 0 }

This actually solves the problem - even after a few minutes running, it hasn't crashed and memory usage is only 3.2MB.

You can stick with that solution, or try to find where the real strictness problem is (as that makes everything strict). You do this by removing the force from the step function, and trying it on the helper functions it uses (eg setElem, findPrevBacket, etc). It turns out that setElem is the culprit, putting force in front of that function also solves the strictness problem. I'm guessing it is because the if in the map lambda means most values never have to be evaluated in the list, and possibly build up huge thunks as the program continues.

Upvotes: 3

David Miani
David Miani

Reputation: 14668

The most simple strategy is using the trace function. Eg consider this function:

badFunction :: Int -> Int
badFunction x
 | x < 10 = x * 2
 | x == 15 = badFunction 480
 | even x = badFunction $ x `div` 2
 | odd x = badFunction $ x + 1

main = print . badFunction . read . head =<< getArgs

Eg if you run ./program 13, you will get 42. However, if you run ./program 29, you will get a stack overflow.

To debug this, place trace statements for each case, (from Debug.Trace):

badFunction :: Int -> Int
badFunction x
 | x < 10 = trace ("badF -> small " ++ show x) x * 6
 | x == 15 = trace "badF -> x == 15" $ badFunction 480
 | even x = trace ("badF -> even " ++ show x) $ badFunction $ x `div` 2
 | odd x = trace ("badF -> odd " ++ show x) badFunction $ x + 1

trace has the type String -> a -> a, and prints the given string, then returns the value of the second argument. It is a special function, as it performs IO in a pure function. It is great for debugging though.

In this case, running the program now with ./program 19, you will get the output:

badF -> odd 19
badF -> even 20
badF -> even 10
badF -> small 5
30

Showing exactly what was called.

If you now run it with ./program 29, you get:

badF -> odd 29
badF -> even 30
badF -> x == 15
badF -> even 960
badF -> even 480
badF -> even 240
badF -> even 120
badF -> even 60
badF -> even 30
badF -> x == 15
badF -> even 960
badF -> even 480
badF -> even 240
badF -> even 120
badF -> even 60
badF -> even 30
badF -> x == 15
badF -> even 960
badF -> even 480
badF -> even 240
badF -> even 120
badF -> even 60
badF -> even 30
badF -> x == 15
badF -> even 960
....

This pretty clearly shows how the loop is occurring. While in this example it was pretty obvious where the problem was, it is useful for more complex functions (especially if the stack overflow involves multiple functions - just do this with all the functions you suspect might be the problem).

Upvotes: 1

nponeccop
nponeccop

Reputation: 13677

See http://book.realworldhaskell.org/read/profiling-and-optimization.html for general guidelines on profiling

Upvotes: 1

Related Questions