Jason Hu
Jason Hu

Reputation: 6333

Memoized Collatz sequence

I've posted the same question in CodeReview but failed to get an answer. so I am trying my luck here in SO.

Here is one of my programs that utilized memoization and array to improve performance and memory usage. The performance seems satisfactory but the memory usage is ridiculous and I can't figure out what's wrong:

{-# LANGUAGE BangPatterns #-}
import Data.Functor
import Data.Array (Array)
import qualified Data.Array as Arr
import Control.DeepSeq

genColtzArr n = collatzArr
    where collatzArr = Arr.array (1, n) $ take n $ map (\v -> (v, collatz v 0)) [1..] 
          collatz 1 !acc  = 1 + acc
          collatz !m !acc
              | even m    = go (m `div` 2) acc
              | otherwise = go (3 * m + 1) acc
              where go !l !acc
                      | l <= n    = let !v = collatzArr Arr.! l in 1 + acc + v
                      | otherwise = collatz l $ 1 + acc

collatz here means this guy. This function is supposed to receive a number n, and then return an array indexing from 1 to n, and in which each cell contains the length of the link from the index to 1 by applying Collatz formula.

But the memory usage of this method is so high. Here is the profiler result (ghc option -prof -fprof-auto -rtsopts, run time option +RTS -p, n == 500000):

total alloc = 730,636,136 bytes  (excludes profiling overheads)

COST CENTRE              MODULE  %time %alloc

genColtzArr.collatz      Main     40.4   34.7
genColtzArr.collatz.go   Main     25.5   14.4


COST CENTRE                      MODULE                    no.     entries  %time %alloc   %time %alloc     

      genColtzArr                Main                      105           1    0.0    0.0    74.7   72.1
       genColtzArr.collatzArr    Main                      106           1    8.0   20.8    74.7   72.1
        genColtzArr.collatzArr.\ Main                      107      500000    0.9    2.2    66.8   51.3
         genColtzArr.collatz     Main                      109     1182582   40.4   34.7    65.9   49.1
          genColtzArr.collatz.go Main                      110     1182581   25.5   14.4    25.5   14.4

Please note that -O2 is not a desired answer. I want to figure out what's the problem in this program and in general, how should I spot time and memory inefficiencies in Haskell code. Specifically, I have no idea why this code, with tail recursion and bang pattern, can consume so much memory.

UPDATE1:

the same code with -s produces this:

   1,347,869,264 bytes allocated in the heap
     595,901,528 bytes copied during GC
     172,105,056 bytes maximum residency (7 sample(s))
         897,704 bytes maximum slop
             315 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2408 colls,     0 par    0.412s   0.427s     0.0002s    0.0075s
  Gen  1         7 colls,     0 par    0.440s   0.531s     0.0759s    0.1835s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.828s  (  0.816s elapsed)
  GC      time    0.852s  (  0.958s elapsed)
  RP      time    0.000s  (  0.000s elapsed)
  PROF    time    0.000s  (  0.000s elapsed)
  EXIT    time    0.004s  (  0.017s elapsed)
  Total   time    1.684s  (  1.791s elapsed)

  %GC     time      50.6%  (53.5% elapsed)

  Alloc rate    1,627,861,429 bytes per MUT second

  Productivity  49.4% of total user, 46.4% of total elapsed

so it takes 300 meg. that is still too large.

Update2

full code

{-# LANGUAGE BangPatterns #-}
import Data.Functor
import Data.Array (Array)
import qualified Data.Array as Arr
import Control.DeepSeq

genColtzArr n = collatzArr
    where collatzArr = Arr.array (1, n) $ take n $ map (\v -> (v, collatz v 0)) [1..] 
          collatz 1 !acc  = 1 + acc
          collatz !m !acc
              | even m    = go (m `div` 2) acc
              | otherwise = go (3 * m + 1) acc
              where go !l !acc
                      | l <= n    = let !v = collatzArr Arr.! l in 1 + acc + v
                      | otherwise = collatz l $ 1 + acc


genLongestArr n = Arr.array (1, n) llist
    where colatz = genColtzArr n
          llist  = (1, 1):zipWith (\(n1, a1) l2 -> 
                                    let l1 = colatz Arr.! a1
                                     in (n1 + 1, if l2 < l1 then a1 else n1 + 1)) 
                                  llist (tail $ Arr.elems colatz)


main :: IO ()
main = getLine >> do
    ns <- map read <$> lines <$> getContents
    let m          = maximum ns
    let lar        = genLongestArr m
    let iter []    = return ()
        iter (h:t) = (putStrLn $ show $ lar Arr.! h) >> iter t
    iter ns

Upvotes: 0

Views: 321

Answers (1)

zakyggaps
zakyggaps

Reputation: 3080

As the other answer on CodeReview hints, it's alright for a 500000-element boxed array to comsume ~20MB memory, however it's not only the array but a lot of things all together:

collatz 500000 +RTS -hr -L50

Although you put bang patterns every where, array initialization itself is a lazy foldr:

-- from GHC.Arr
array (l,u) ies
    = let n = safeRangeSize (l,u)
      in unsafeArray' (l,u) n
                      [(safeIndex (l,u) n i, e) | (i, e) <- ies]

unsafeArray' :: Ix i => (i,i) -> Int -> [(Int, e)] -> Array i e
unsafeArray' (l,u) n@(I# n#) ies = runST (ST $ \s1# ->
    case newArray# n# arrEleBottom s1# of
        (# s2#, marr# #) ->
            foldr (fill marr#) (done l u n marr#) ies s2#)

So unless you evaluated the last bit of an array, it's holding reference to the list used in initialization. Usually the list can be GC'd on fly while you evaluating the array, but in your case the mutual references and self references disturbed the common GC pattern.

  • llist is self-referencing to produce every single element, so it will not be GC'd until you evaluated the last element of it
  • it also holds a reference to genColtzArr so genColtzArr won't be GC'd until llist is fully evaluated
  • you might think collatz is tail recursive but it's not, it's mutual recursive with collatzArr so again both of them won't be GC'd until fully evaluated

Everything combined, your program will keep three 500000-element list-like structures in memory and results ~80MB peak heap size.


Solution

The obvious solution is to force every array / list to normal form before it's used in another so you won't keep multiple copys of the same data in the memory.

genLongestArr :: Int -> Array Int Int
genLongestArr n =
  let collatz = genColtzArr n
  -- deepseq genColtzArr before mapping over it
  -- this is equivalent to your recursive definition
  in collatz `deepseq` (Arr.listArray (1,n) $ fmap fst $ scanl' (maxWith snd) (0, 0) $ Arr.assocs collatz)

maxWith :: Ord a => (b -> a) -> b -> b -> b
maxWith f b b' = case compare (f b) (f b') of
  LT -> b'
  _  -> b

And in main:

-- deepseq lar before mapping over it
-- this is equivalent to your iter loop
lar `deepseq` mapM_ (print . (lar Arr.!)) ns

Nothing can be done with genColtzArr, it's using itself for memorization so the mutual recursion is kind of necessary.

Now the heap graph peaks at ~20MB as it should:

collatz2 500000 +RTS -hr -L50

(Disclaimer: All programs in this answer were compiled with -O0)

Upvotes: 2

Related Questions