Reputation: 3086
I have a simple problem: given a list of integers, read the first line as N, then read the next N lines and output their sum. Repeat until N = 0.
My first approach was using this:
main = interact $ unlines . f . (map read) . lines

f :: [Int] -> [String]
f (n:ls)
    | n == 0    = []
    | otherwise = [show rr] ++ (f rest)
    where (xs, rest) = splitAt n ls
          rr = sum xs
f _ = []
But it's relatively slow. I've profiled it using
ghc -O2 --make test.hs -prof -auto-all -caf-all -fforce-recomp -rtsopts
time ./test +RTS -hc -p -i0.001 < input.in
where input.in is a test input whose first line is 100k, followed by 100k random numbers and then a 0. We can see in the figure below that it's using O(N) memory:
EDITED: My original question compared two similarly slow approaches. I've updated it to compare with an optimized approach below.
Now, if I do the sum iteratively instead of calling sum, I get a constant amount of memory:
{-# LANGUAGE BangPatterns #-}

main = interact $ unlines . g . (map read) . lines

g :: [Int] -> [String]
g (n:ls)
    | n == 0    = []
    | otherwise = g' n ls 0
g _ = []

g' n (l:ls) !cnt
    | n == 0    = [show cnt] ++ (g (l:ls))
    | otherwise = g' (n-1) ls (cnt + l)
I'm trying to understand what causes the performance loss in the first example. I would have guessed that everything there could be evaluated lazily?
Upvotes: 4
Views: 141
Reputation: 18189
I think that laziness is what prevents your first and second versions from being equivalent.
Consider the result created from these input "numbers":
1
garbage_here
2
3
5
0
The first version would give the result list [error "...some parse error", 8], whose second element you can safely look at, while the second version errors out almost immediately. It seems hard to achieve the first behaviour in a streaming way.
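A small way to see that difference concretely (a sketch only: it assumes the f and g definitions from the question are loaded, and the names below are hypothetical):

-- Hypothetical demo input mirroring the "numbers" above; read is applied
-- lazily, so the bad entry only blows up when its element is actually forced.
demo :: [Int]
demo = map read ["1", "garbage_here", "2", "3", "5", "0"]

-- f demo !! 1   -- "8": the sum of the first (broken) group is never forced
-- g demo !! 1   -- *** Exception: Prelude.read: no parse
--               -- the strict accumulator forces the first group before any output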
Even without laziness, though, getting from the first to the second version may be more than GHC can handle - it would need fusion rewrite rules combining foldl/foldl' on the first element of a tuple with splitAt. And GHC has only recently got to the point where it can fuse foldl/foldl' at all.
Upvotes: 1
Reputation: 62818
I don't know precisely what is causing the difference. But I can show you this:
Data.Map> sum [1 .. 1e8]
Out of memory.
Data.Map> foldl' (+) 0 [1 .. 1e8]
5.00000005e15
For some reason, sum is defined as foldl (+) 0 rather than with foldl' (note the apostrophe). The difference is that foldl' is strict in its accumulator, so it runs in essentially constant memory. The lazy version, by contrast, does this:
sum [1..100]
1 + sum [2..100]
1 + 2 + sum [3..100]
1 + 2 + 3 + sum [4..100]
...
In other words, it creates a giant expression that says 1 + 2 + 3 + ... And then, right at the end, it tries to evaluate it all. Well, obviously, that's going to eat a lot of RAM. By using foldl'
instead of foldl
, you make it do the additions immediately, rather than pointlessly storing them in RAM.
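To make the distinction concrete, here is a rough sketch of the two folds (hypothetical names, not the real library definitions, which are written via foldr so they can take part in fusion):

{-# LANGUAGE BangPatterns #-}

-- Lazy left fold: the accumulator is passed along as an unevaluated thunk,
-- so the pending additions grow with the length of the list.
lazyFoldl :: (b -> a -> b) -> b -> [a] -> b
lazyFoldl _ acc []     = acc
lazyFoldl f acc (x:xs) = lazyFoldl f (f acc x) xs

-- Strict left fold: the bang pattern forces the accumulator at every step,
-- so the additions happen immediately and no thunk chain builds up.
strictFoldl :: (b -> a -> b) -> b -> [a] -> b
strictFoldl _ !acc []     = acc
strictFoldl f !acc (x:xs) = strictFoldl f (f acc x) xs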
You probably also want to do I/O using ByteString
rather than String
; but the laziness difference will probably give you a big speed boost on its own.
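For what it's worth, a rough sketch of combining both suggestions (ByteString I/O plus a strict fold) might look like the following; the parsing is simplified and assumes well-formed input, so treat it as an illustration rather than a drop-in replacement:

import qualified Data.ByteString.Char8 as B
import Data.List (foldl')

main :: IO ()
main = B.interact $ B.unlines . go . map parse . B.lines
  where
    -- Assumes every line starts with an integer; readInt avoids String overhead.
    parse s = maybe (error "parse error") fst (B.readInt s)
    go (n:ls)
      | n == 0    = []
      | otherwise = let (xs, rest) = splitAt n ls
                    in B.pack (show (foldl' (+) 0 xs)) : go rest
    go _ = []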
Upvotes: 3