Helix

Reputation: 191

Using list generator for memory efficient code in Haskell

I would like to get a handle on writing memory-efficient Haskell code. One thing I ran across is that there is no dead-easy way to make Python-style list generators/iterators (that I could find).

Small example:

Finding the sum of the integers from 1 to 100000000 without using the closed form formula.

In Python this can be done quickly and with minimal memory as sum(xrange(100000000)). In Haskell the analogue would be sum [1..100000000]. However this uses up a lot of memory. I thought using foldl or foldr would be fine, but even that uses a lot of memory and is slower than Python. Any suggestions?
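For reference, a fold-based version of the same computation looks roughly like this (a sketch of such an attempt, not the exact code tried):

import Data.List (foldl')

-- Sketch of a fold-based attempt (not the exact code tried):
-- a strict left fold over the enumeration, with no type annotation.
main :: IO ()
main = print $ foldl' (+) 0 [1..100000000]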

Upvotes: 4

Views: 314

Answers (1)

epsilonhalbe

Reputation: 15959

TL;DR - I think the culprit in this case is GHC's defaulting to Integer.

Admittedly I do not know enough about Python, but my first guess would be that Python switches to "bigint" only if necessary - therefore all calculations are done with Int, a.k.a. a 64-bit integer on my machine.
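This defaulting can be made visible at compile time with GHC's type-default warning - a minimal sketch of my own, not one of the programs below:

{-# OPTIONS_GHC -fwarn-type-defaults #-}

-- Sketch: with no annotation the numeric constraints on the sum are
-- ambiguous, and GHC's defaulting rules resolve them to Integer; the
-- warning flag above reports exactly that defaulting during compilation.
main :: IO ()
main = print $ sum [1..100000000]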

A first check with

$> ghci
GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
Prelude> maxBound :: Int
9223372036854775807

reveals that the result of the sum, 100000000 * (100000000 + 1) / 2 = 5000000050000000, is less than that number, so we have no fear of Int overflow.

I have guessed that your example programs look roughly like this:

sum.py

print(sum(xrange(100000000)))

sum.hs

main :: IO ()
main = print $ sum [1..100000000]

To make things explicit I added the type annotation (100000000 :: Integer). I compiled it with

$ > stack build --ghc-options="-O2 -with-rtsopts=-sstderr"

and ran your example:

$ > stack exec -- time sum
5000000050000000
   3,200,051,872 bytes allocated in the heap
         208,896 bytes copied during GC
          44,312 bytes maximum residency (2 sample(s))
          21,224 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      6102 colls,     0 par    0.013s   0.012s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    1.725s  (  1.724s elapsed)
  GC      time    0.013s  (  0.012s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    1.739s  (  1.736s elapsed)

  %GC     time       0.7%  (0.7% elapsed)

  Alloc rate    1,855,603,449 bytes per MUT second

  Productivity  99.3% of total user, 99.4% of total elapsed

1.72user 0.00system 0:01.73elapsed 99%CPU (0avgtext+0avgdata 4112maxresident)k

and indeed the ~3 GB of heap allocation is reproduced.

Changing the annotation to (100000000 :: Int) altered the behaviour drastically.
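The changed source looks roughly like this (a sketch of the edit described, not the exact file):

-- Sketch of sum.hs with the Int annotation; the annotation pins the
-- element type to Int instead of the defaulted Integer.
main :: IO ()
main = print $ sum [1..(100000000 :: Int)]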

$ > stack build
$ > stack exec -- time sum
5000000050000000
          51,872 bytes allocated in the heap
           3,408 bytes copied during GC
          44,312 bytes maximum residency (1 sample(s))
          17,128 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0         0 colls,     0 par    0.000s   0.000s     0.0000s    0.0000s
  Gen  1         1 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.034s  (  0.034s elapsed)
  GC      time    0.000s  (  0.000s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.036s  (  0.035s elapsed)

  %GC     time       0.2%  (0.2% elapsed)

  Alloc rate    1,514,680 bytes per MUT second

  Productivity  99.4% of total user, 102.3% of total elapsed

0.03user 0.00system 0:00.03elapsed 91%CPU (0avgtext+0avgdata 3496maxresident)k
0inputs+0outputs (0major+176minor)pagefaults 0swaps

For the interested

The behaviour of the Haskell version does not change a lot if you use libraries like conduit or vector (both boxed and unboxed).

Sample programs:

sumC.hs

import Data.Conduit
import Data.Conduit.List as CL

main :: IO ()
main = do res <- CL.enumFromTo 1 100000000 $$ CL.fold (+) (0 :: Int)
          print res

sumV.hs

import           Data.Vector.Unboxed as V
{-import           Data.Vector as V-}

main :: IO ()
main = print $ V.sum $ V.enumFromTo (1::Int) 100000000

Funnily enough, the version using

main = print $ V.sum $ V.enumFromN (1::Int) 100000000

is doing worse than the one above, even though the documentation says otherwise:

enumFromN :: (Unbox a, Num a) => a -> Int -> Vector a

O(n) Yield a vector of the given length containing the values x, x+1 etc. This operation is usually more efficient than enumFromTo.

Update

@Carsten's comment made me curious, so I had a look into the sources for Integer - well, integer-simple to be precise, because for Integer there exist other implementations, integer-gmp and integer-gmp2, which use libgmp.

data Integer = Positive !Positive | Negative !Positive | Naught

-------------------------------------------------------------------
-- The hard work is done on positive numbers

-- Least significant bit is first

-- Positive's have the property that they contain at least one Bit,
-- and their last Bit is One.
type Positive = Digits
type Positives = List Positive

data Digits = Some !Digit !Digits
            | None
type Digit = Word#

data List a = Nil | Cons a (List a)

So when using Integer there is quite a bit of memory overhead compared to Int, or rather to the unboxed Int# that I would guess the optimizer produces here (though I have not confirmed that).

So an Integer is (if I calculate correctly)

  • 1 x Word for the sum-type tag (here Positive)
  • n x (Word + Word) for the Some constructors and their Digit parts
  • 1 x Word for the last None

a memory overhead of 2 + 2n Words for each Integer in that calculation (where n is the number of Word-sized digits, so n = 1 for the values here), plus a bit more for the accumulator.
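As a concrete illustration (my own sketch, using the constructors quoted above), a summand that fits into a single Digit, say 42, is laid out roughly as

-- 42 :: Integer, in integer-simple terms (42## is an unboxed Word# literal)
Positive (Some 42## None)
-- 1 Word (Positive tag) + 2 Words (Some + its Digit) + 1 Word (None) = 4 Words,
-- versus a single machine Word for an unboxed Int#.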

Upvotes: 6
