Memoized Collatz sequence

Question

I've posted the same question in CodeReview but failed to get an answer. so I am trying my luck here in SO.

Here is one of my programs that utilized memoization and array to improve performance and memory usage. The performance seems satisfactory but the memory usage is ridiculous and I can't figure out what's wrong:

{-# LANGUAGE BangPatterns #-}
import Data.Functor
import Data.Array (Array)
import qualified Data.Array as Arr
import Control.DeepSeq

genColtzArr n = collatzArr
    where collatzArr = Arr.array (1, n) $ take n $ map (\v -> (v, collatz v 0)) [1..] 
          collatz 1 !acc  = 1 + acc
          collatz !m !acc
              | even m    = go (m `div` 2) acc
              | otherwise = go (3 * m + 1) acc
              where go !l !acc
                      | l <= n    = let !v = collatzArr Arr.! l in 1 + acc + v
                      | otherwise = collatz l $ 1 + acc

collatz here means this guy. This function is supposed to receive a number n, and then return an array indexing from 1 to n, and in which each cell contains the length of the link from the index to 1 by applying Collatz formula.

But the memory usage of this method is so high. Here is the profiler result (ghc option -prof -fprof-auto -rtsopts, run time option +RTS -p, n == 500000):

total alloc = 730,636,136 bytes  (excludes profiling overheads)

COST CENTRE              MODULE  %time %alloc

genColtzArr.collatz      Main     40.4   34.7
genColtzArr.collatz.go   Main     25.5   14.4


COST CENTRE                      MODULE                    no.     entries  %time %alloc   %time %alloc     

      genColtzArr                Main                      105           1    0.0    0.0    74.7   72.1
       genColtzArr.collatzArr    Main                      106           1    8.0   20.8    74.7   72.1
        genColtzArr.collatzArr.\ Main                      107      500000    0.9    2.2    66.8   51.3
         genColtzArr.collatz     Main                      109     1182582   40.4   34.7    65.9   49.1
          genColtzArr.collatz.go Main                      110     1182581   25.5   14.4    25.5   14.4

Please note that -O2 is not a desired answer. I want to figure out what's the problem in this program and in general, how should I spot time and memory inefficiencies in Haskell code. Specifically, I have no idea why this code, with tail recursion and bang pattern, can consume so much memory.

UPDATE1:

the same code with -s produces this:

   1,347,869,264 bytes allocated in the heap
     595,901,528 bytes copied during GC
     172,105,056 bytes maximum residency (7 sample(s))
         897,704 bytes maximum slop
             315 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2408 colls,     0 par    0.412s   0.427s     0.0002s    0.0075s
  Gen  1         7 colls,     0 par    0.440s   0.531s     0.0759s    0.1835s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.828s  (  0.816s elapsed)
  GC      time    0.852s  (  0.958s elapsed)
  RP      time    0.000s  (  0.000s elapsed)
  PROF    time    0.000s  (  0.000s elapsed)
  EXIT    time    0.004s  (  0.017s elapsed)
  Total   time    1.684s  (  1.791s elapsed)

  %GC     time      50.6%  (53.5% elapsed)

  Alloc rate    1,627,861,429 bytes per MUT second

  Productivity  49.4% of total user, 46.4% of total elapsed

so it takes 300 meg. that is still too large.

Update2

full code

{-# LANGUAGE BangPatterns #-}
import Data.Functor
import Data.Array (Array)
import qualified Data.Array as Arr
import Control.DeepSeq

genColtzArr n = collatzArr
    where collatzArr = Arr.array (1, n) $ take n $ map (\v -> (v, collatz v 0)) [1..] 
          collatz 1 !acc  = 1 + acc
          collatz !m !acc
              | even m    = go (m `div` 2) acc
              | otherwise = go (3 * m + 1) acc
              where go !l !acc
                      | l <= n    = let !v = collatzArr Arr.! l in 1 + acc + v
                      | otherwise = collatz l $ 1 + acc


genLongestArr n = Arr.array (1, n) llist
    where colatz = genColtzArr n
          llist  = (1, 1):zipWith (\(n1, a1) l2 -> 
                                    let l1 = colatz Arr.! a1
                                     in (n1 + 1, if l2 < l1 then a1 else n1 + 1)) 
                                  llist (tail $ Arr.elems colatz)


main :: IO ()
main = getLine >> do
    ns <- map read <$> lines <$> getContents
    let m          = maximum ns
    let lar        = genLongestArr m
    let iter []    = return ()
        iter (h:t) = (putStrLn $ show $ lar Arr.! h) >> iter t
    iter ns

Memoized Collatz sequence

Answers (1)

Solution

Related Questions