misterbee
misterbee

Reputation: 5182

Efficiently print an Array to text in Haskell (bogus, sorry)

Update 2: This question is bogus. I apologize. I added some code to print an array, skipping the rest of my program logic, and it is nearly instantaneous. What I though was slow array printing must have been slow {something} that was lazily computed after the array-print function was entered.

Update 1: The snippet below does not faithfully reproduce the performance problem in my original problem. I must have misattributed the source of the slowness, either due to mis-tracking lazuy evaulation, or some other change that I made when extricating the small snippet from the larger program. I will try to create a self-contained reproduction of the slow processing.

Summary: Formatting an Unboxed Array of Int32 to String, and printing the resulting ~4MB of text to disk, is inefficient, about 1 minute to print 1 million values (60K CPU cycles per value). Why? Can it be made faster?

Details:

Running as a compiled program with "-O2" compiler flag. GHC 7.0.4, Haskell Platform 2011.04, Windows 7 x64.

I have a Haskell program that builds an array of size 100K to 1 Million Int32 values.

The Int32 values are actually Int8 4-tuple values (Int8, Int8, Int8, Int8) packed into an Int32 in a (perhaps misguided) attempt at efficiency.

I am using STUArray (but that might be a poor choice) to store and mutate the array.

At certain points in my program, I print the array to disk, in this form:

byte byte byte ...

where each byte is an ASCII representation of a decimate value of a byte ("0" to "255"). Example:

0 255 128 5 31 ...

My problem is that the IO action of printing the array to a Handle is slow: It takes about a minute to format print the ~1Million bytes to text on the file handle, on a modern 3.3GHz i5 CPU and SSD disk.

My printing code is basically this:

import Data.Array.Unboxed
import System.IO
import Contol.Monad
import Data.Bits

printArray handle frozenArray = mapM_ (hPutStr handle . showPacked . (frozenArray !)) [0..arrayLength]

showPacked x = (' ':) .  (shows $ (shift x (-24)) .&. 255) 
           . (' ':) .  (shows $ (shift x (-16)) .&. 255)
           . (' ':) .  (shows $ (shift x  (-8)) .&. 255) 
           . (' ':) .  (shows $        x        .&. 255)
           $ ""

Is there a better way?

What's likely to be the source of the problem? I have a few guesses (in decreasing order of likelihood):

Potential red herring warning: There is some risk that the delay I see is related to lazy computation that is forced only when I print, and not caused by formating/printing itself, but I don't think so, because even when I print the array in the first iteration (after creating an initial blank array), it's still slow. And I can watch the output file grow slowly, so a lot of work is happening after the first byte is printed, even using a strict unboxed array.

Upvotes: 1

Views: 690

Answers (2)

Daniel Fischer
Daniel Fischer

Reputation: 183978

What's likely to be the source of the problem? I have a few guesses (in decreasing order of likelihood):

  • Formatting each Int32 with my code is inefficient.

So-so. I'm not sure how well GHC optimises shift x (-24), maybe using shiftR x 24 is better, but I need to look at the generated core to know. Edit: perfectly fine, produces uncheckedIShiftRA#, as one could hope for.

  • mapM_ over the array with (!) inefficient.

Yes, a bit. Using elems or unsafeAt is slightly faster due to the omission of bounds-checks.

  • Calling hPutStr a million times is inefficient.

Definitely. But has less impact than I thought.

  • 1 minute to print 1 million values actually is efficient.

No:

Prelude> writeFile "IntList.txt" $ unlines . map show $ [1 :: Int .. 1048576]
(0.37 secs, 836888976 bytes)

It's far more efficient - and may well be more efficient than using an intermediate ByteString - to produce one long String which is printed to the handle at once. The String will be lazily generated, so it will not use much memory, and the generating of the String is pretty efficient (still tweakable if need be).

Going off to measure now.

Using

printArray handle frozenArray = hPutStr handle $ showit (elems frozenArray)
  where
    showit :: [Int32] -> String
    showit (n:ns) = (' ' :) . shows ((n `shiftR` 24) .&. 255)
                  . (' ' :) . shows ((n `shiftR` 16) .&. 255)
                  . (' ' :) . shows ((n `shiftR` 8) .&. 255)
                  . (' ' :) . shows (n .&. 255) $ showit ns
    showit [] = ""

to print the values brought the time to process [0 .. 1048575] from 0.9 seconds to 0.4 seconds. Allocation went down from 2,031,716,528 bytes to 779,086,856 bytes, but memory usage went up from 6MB to 11MB and bytes copied during GC from ~1M to ~5.5M, however, restricting the heap size -M11M to force earlier GC brought memory back down to 6MB and bytes copied during GC to 676,744. (ghc-7.2.2)

Upvotes: 1

MathematicalOrchid
MathematicalOrchid

Reputation: 62848

  1. Do you really want to do that? Do you actually need this stuff in textual form, or would some kind of binary format be OK? Because if so, that's likely to be much faster...

  2. If you really do need it to be decimal text, read on.

My first instinct is to avoid doing I/O with String values. Instead, try building the entire result string inside a lazy ByteString, and then write the whole lot to disk in one I/O call. As a benefit, the code becomes purer. (Assuming your array really is frozen at the moment you're trying to save it to disk.) You can then let the lazy ByteString engine lazily generate the text and do fancy buffering and stuff.

Unfortunately, I don't know of any way to get directly from an integer to a textual representation in a ByteString without using show to go via a String. But you may still see some benefit from it. Or not...

Upvotes: 2

Related Questions