Manual performGC hugely reduces memory footprint

Question

My program uses the GHC API in IO, doing some computation inside a GhcMonad and forcing the result before returning it; something like this:

main :: IO ()
main = do
    x <- runGhcT $ do
        x0 <- someGhcFunctionality
        x1 <- furtherProcessing
        liftIO . evaluate . force $ x1

    putStrLn "Done with GHC."
    _ <- getLine

    continueProcessingOutsideGhc x

At the pause point, I can see the process using 30+ GB of RAM; since continueProcessingOutsideGhc also uses some amount of memory on its own, this can lead to running out of memory in the middle of continueProcessingOutsideGhc.

However, what I have found is that manually forcing garbage collection at the pause point changes things drastically:

import System.Mem

main :: IO ()
main = do
    x <- runGhcT $ do
        x0 <- someGhcFunctionality
        x1 <- furtherProcessing
        liftIO . evaluate . force $ x1

    putStrLn "Done with GHC."
    _ <- getLine

    performGC
    putStrLn "Done with performGC."
    _ <- getLine

    continueProcessingOutsideGhc x

That performGC line decreases the memory footprint by about 85%, to roughly 4 GB. This is of course enough, by a wide margin, to let continueProcessingOutsideGhc finish. I should also note that doing liftIO performGC inside runGhcT doesn't have the same effect; I guess that makes sense if the global GHC context is holding on to a lot of things.
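The before/after difference can also be measured from inside the program rather than read off an external process monitor. The following is a minimal sketch using GHC.Stats from base, not the actual program: reportMemory and showMiB are hypothetical helper names, and the list summation in main is only a stand-in for the GHC API work. It assumes the executable is run with +RTS -T, since the RTS does not collect these statistics otherwise.

```haskell
import Control.Exception (evaluate)
import Data.Word (Word64)
import GHC.Stats (GCDetails (..), RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.Mem (performGC)

-- Hypothetical helper: format a byte count as whole mebibytes.
showMiB :: Word64 -> String
showMiB bytes = show (bytes `div` (1024 * 1024)) ++ " MiB"

-- Hypothetical helper: print the figures recorded by the most recent GC.
-- The numbers describe the heap as of that GC, not the current instant.
reportMemory :: String -> IO ()
reportMemory label = do
    enabled <- getRTSStatsEnabled
    if not enabled
        then putStrLn (label ++ ": RTS stats disabled; run with +RTS -T")
        else do
            stats <- getRTSStats
            let details = gc stats
            putStrLn $ label
                ++ ": live = " ++ showMiB (gcdetails_live_bytes details)
                ++ ", in use = " ++ showMiB (gcdetails_mem_in_use_bytes details)

main :: IO ()
main = do
    -- Stand-in for the runGhcT block: allocate, keep only a small result.
    total <- evaluate (sum [1 .. 5000000 :: Int])
    reportMemory "before performGC"
    performGC  -- forces a major collection
    reportMemory "after performGC"
    print total
```

Since the "before" figures come from whatever collection happened last, which may have been a minor one, they can count data in older generations as live even if it is no longer reachable. The figures after performGC reflect a major collection and are the more reliable of the two.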

What I'd like to understand is why all that garbage is left there after exiting runGhcT without the manual performGC.

