Reputation: 22520
I understand that I should not try to re-read from stdin, because doing so fails with a "handle is closed" error from Haskell IO. For example, consider:
main = do
    x <- getContents
    putStrLn $ map id x
    x <- getContents -- problem line
    putStrLn x
Here the second call, x <- getContents, causes the error:
test: <stdin>: hGetContents: illegal operation (handle is closed)
Of course, I can omit the second call to getContents:
main = do
    x <- getContents
    putStrLn $ map id x
    putStrLn x
But will this become a performance or memory issue? Will GHC have to keep all of the contents read from stdin in main memory?
I imagine that the first time x is consumed, GHC can throw away the portions of x that have already been processed, so in theory it would need only a small, constant amount of memory. But since we are going to use x again (and again), it seems that GHC cannot throw anything away. (Nor can it read from stdin again.)
Is my understanding about the memory implications here correct? And if so, is there a fix?
Upvotes: 1
Views: 540
Reputation: 85827
Yes, your understanding is correct: if you reuse x, GHC has to keep all of it in memory.
I think a possible fix is to traverse the input only once, consuming it lazily.
Let's say you want to write x to several output handles, hdls :: [Handle]. The naive approach is:
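import Control.Monad (forM_)
import System.IO (Handle, hPutStr)

-- hdls :: [Handle] is assumed to be defined elsewhere.
main :: IO ()
main = do
    x <- getContents
    forM_ hdls $ \hdl ->
        hPutStr hdl x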
This will read stdin into x as the first hPutStr traverses the string (at least for unbuffered handles, hPutStr is simply a loop that calls hPutChar for each character in the string). From then on, x is kept in memory for all of the following handles.
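To make the parenthetical above concrete, here is a simplified model of that loop. It is only a sketch of the unbuffered case, not the actual library source, and hPutStrSketch is a name made up for illustration:

```haskell
import System.IO (Handle, hPutChar, stdout)

-- Simplified model of hPutStr on an unbuffered handle (an illustration,
-- not the real implementation): write the string one character at a time.
hPutStrSketch :: Handle -> String -> IO ()
hPutStrSketch hdl = mapM_ (hPutChar hdl)

main :: IO ()
main = do
    x <- getContents        -- lazy: nothing has been read from stdin yet
    hPutStrSketch stdout x  -- walking the list is what drives the reads
```

Seen this way, the naive version is a loop over handles wrapped around a loop over characters, and the string must survive until the outer loop finishes.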
Alternatively:
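import Control.Monad (forM_)
import System.IO (Handle, hPutChar)

-- hdls :: [Handle] is assumed to be defined elsewhere.
main :: IO ()
main = do
    x <- getContents
    -- Outer loop over input characters, inner loop over handles.
    forM_ x $ \c ->
        forM_ hdls $ \hdl ->
            hPutChar hdl c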
Here we've transposed the loops: instead of iterating over the handles (and for each handle iterating over the input characters), we iterate over the input characters, and for each character, we print it to each handle.
I haven't tested it, but this form should not need much memory, because each input character c is used once and then discarded.
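If writing one character at a time turns out to be too slow, the same transposition should also work chunk by chunk. The following is an untested sketch that swaps String for lazy ByteString from the bytestring package (so it copies raw bytes with no text decoding). The names fanOut and the concrete hdls list are made up for illustration:

```haskell
import Control.Monad (forM_)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import System.IO (Handle, stderr, stdout)

-- Hypothetical handle list, chosen only so the example is self-contained.
hdls :: [Handle]
hdls = [stdout, stderr]

-- Write each chunk to every handle before moving on to the next chunk.
fanOut :: [Handle] -> BL.ByteString -> IO ()
fanOut hs input =
    forM_ (BL.toChunks input) $ \chunk ->
        forM_ hs $ \hdl ->
            BS.hPut hdl chunk

main :: IO ()
main = do
    input <- BL.getContents  -- lazy: chunks are read on demand
    fanOut hdls input
```

As with the character version, each chunk is used once and is not referenced again afterwards, so I would expect memory use to stay roughly proportional to the chunk size rather than the input size.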
Upvotes: 2