alexpbell

Reputation: 85

Why do we need IO?

In Tackling the Awkward Squad: monadic input/output, concurrency, exceptions, and foreign-language calls in Haskell, SPJ states:

For example, perhaps the functional program could be a function mapping an input character string to an output string:

main :: String -> String

Now a "wrapper" program, written in (gasp!) C, can get an input string from somewhere [...] apply the function to it, and store the result somewhere [...]

He then goes on to say that this locates the "sinfulness" in the wrapper, and that the trouble with this approach is that one sin leads to another (e.g. more than one input, deleting files, opening sockets etc).

This seems odd to me. I would have thought Haskell would be most powerful, and possibly even most useful, when approached exactly in this fashion. That is, the input is a character string located in a file, and output is a new character string in a new file. If the input string is some mathematical expression concatenated with data, and the output string is (non-Haskell) code, then you can Get Things Done. In other words, why not always treat Haskell programs as translators? (Or as a compiler, but as a translator you can blend genuine I/O into the final executable.)

Regardless of the wisdom of this as a general strategy (I appreciate that some things we might want to get done may not start out as mathematics), my real question is: if this is indeed the approach, can we avoid the IO a type? Do we need a wrapper in some other language? Does anyone actually do this?

Upvotes: 0

Views: 299

Answers (3)

Davislor

Reputation: 15124

That wrapper exists. It’s called Prelude.interact. I use the Data.ByteString versions of it often. The wrapper to a pure String -> String function works, because strings are lazily-evaluated singly-linked lists that can process each line of input as it’s read in, but singly-linked lists of UCS-4 characters are a very inefficient data structure.

You still need to use IO for the wrapper because the operations depend on the state of the universe and need to be sequenced with the outside world. In particular, if your program is interactive, you want it to respond to a new keyboard command immediately, and run all the OS system calls in sequence, not (say) process all the input and display all the output at once when it’s ready to quit the program.

A simple program to demonstrate this is:

module Main where
import Data.Char (toUpper)

main :: IO ()
main = interact (map toUpper)

Try running this interactively. Type control-D to send end-of-file and quit on Linux or the macOS console, or control-Z on Windows.

As I mentioned before, though, String is not an efficient data structure at all. For a more complicated example, here is the Main module of a program I wrote to normalize UTF-8 input to NFC form.

module Main ( lazyNormalize, main ) where

import Data.ByteString.Lazy as BL ( fromChunks, interact )
import Data.Text.Encoding as E (encodeUtf8)
import Data.Text.Lazy as TL (toChunks)
import Data.Text.Lazy.Encoding as LE (decodeUtf8)
import Data.Text.ICU ( NormalizationMode (NFC) )
import TextLazyICUNormalize (lazyNormalize)

main :: IO ()
main = BL.interact (
         BL.fromChunks .
         map E.encodeUtf8 .
         TL.toChunks . -- Workaround for buffer not always flushing on newline.
         lazyNormalize NFC .
         LE.decodeUtf8 )

This is a Data.ByteString.Lazy.interact wrapper around a Data.Text.Lazy.Text -> Data.Text.Lazy.Text function, lazyNormalize, with the NormalizationMode constant NFC as its curried first argument. Everything else just converts from the lazy ByteString strings I use to do I/O to the lazy Text strings the ICU library understands, and back. You will probably see more programs written with the & operator than in this point-free style.
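For comparison, here is a rough sketch of the earlier toUpper example written with the & operator from Data.Function instead of function composition. The name shout is made up for illustration, and unlike the original one-liner this version works line by line, so it appends a final newline if the input lacks one.

```haskell
module Main where

import Data.Char (toUpper)
import Data.Function ((&))

-- Hypothetical helper: a pure String -> String transformation,
-- written as a left-to-right pipeline with (&).
shout :: String -> String
shout input =
  input
    & lines                -- split the lazy input into lines
    & map (map toUpper)    -- upper-case each line
    & unlines              -- join the lines back together

-- The IO wrapper is still interact; only the pure part changed style.
main :: IO ()
main = interact shout
```

The data flows top to bottom here, whereas with the . operator it flows from the last function listed to the first.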

Upvotes: 1

atravers

Reputation: 495

There are multiple questions here:

  1. If the input string is some mathematical expression concatenated with data, and the output string is (non-Haskell) code, then you can Get Things Done. In other words, why not always treat Haskell programs as translators? ([... because] as a translator you can blend genuine I/O into the final executable.)

  2. Regardless of the wisdom of this as a general strategy [...] if this is indeed the approach, can we avoid the IO a type?

  3. Do we need a wrapper in some other language?

  4. Does anyone actually do this?


  1. Using your informal description, main would have a type signature resembling:

    main :: (InputExpr, InputData) -> OutputCode

    If we drop the InputData component:

    main :: InputExpr -> OutputCode

    then (as you noted) main really does look like a translator...but we already have programs for doing that - compilers!

    While task-specific translators have their place, using them everywhere when we already have general-purpose ones seems somewhat redundant...

  2. ...so its use as a general strategy is speculative at best. This is probably why such an approach was never officially "built into" Haskell (instead being implemented using Haskell's I/O facilities e.g. interact for simple string-to-string interactions).

    As for getting by without the monadic IO type - for a single-purpose language like Dhall, it could be possible...but can that technique also be used to build webservers or operating systems? (That is left as an exercise for intrepid readers :-)

  3. In theory: no - the various Lisp Machines being the canonical example.

    In practice: yes - as the variety of microprocessors and assembly languages grows ever larger, portably defining the essential runtime services (parallelism, GC, I/O, etc) in an existing programming language is a necessity these days.

  4. Not exactly - perhaps the closest to what you've described is (again) the string-to-string interactivity supported in the venerable Lazy ML system from Chalmers or the original version of Miranda(R).
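To make the translator shape from point 1 concrete, here is a minimal sketch of a string-to-string translator run through interact. Everything in it is an assumption for illustration: the input format (space-separated postfix arithmetic over non-negative integers), the names toInfix and translate, and the choice of C as the output language. Note that main still has type IO (), because interact supplies the wrapper.

```haskell
module Main where

-- Hypothetical toy translator: postfix arithmetic in, a C expression out.
-- Returns Left with a message when the input is not well formed.
toInfix :: String -> Either String String
toInfix = go [] . words
  where
    -- Exactly one value left on the stack: that is the result.
    go [result] [] = Right result
    go stack [] = Left ("malformed expression, stack: " ++ show stack)
    -- An operator combines the top two stack entries.
    go (y : x : rest) (tok : toks)
      | tok `elem` ["+", "-", "*", "/"] =
          go (("(" ++ x ++ " " ++ tok ++ " " ++ y ++ ")") : rest) toks
    -- Anything else must be a non-negative integer literal.
    go stack (tok : toks)
      | all (`elem` ['0' .. '9']) tok = go (tok : stack) toks
      | otherwise = Left ("unexpected token: " ++ tok)

-- Wrap the translated expression in a small C program (integer arithmetic).
translate :: String -> String
translate src = case toInfix src of
  Right expr ->
    unlines
      [ "#include <stdio.h>"
      , "int main(void) {"
      , "  printf(\"%d\\n\", " ++ expr ++ ");"
      , "  return 0;"
      , "}"
      ]
  Left err -> "/* error: " ++ err ++ " */\n"

main :: IO ()
main = interact translate
```

Feeding it the input 3 4 + 2 * produces a C program that prints the value of ((3 + 4) * 2).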

Upvotes: 0

Carl

Reputation: 27003

The point is that String -> String is a rather poor model of what programs do in general.

What if you're writing an HTTP server that accepts concurrent pipelined requests and responds to each pipeline concurrently, while also interleaving writes from a response in a pipeline with reads for the next request? This is the level of concurrency that HTTP servers work at.

Maybe, just maybe, you can stuff that into a String -> String program. You could multiplex the pipelines into your single channel. But what about timeouts? Web servers time out connections that trickle in, to prevent slow loris attacks. How are you even going to account for that? Maybe your input string has a sequence of timestamps added at regular intervals, regardless of other inputs? Oh, but what about the variant where the recipient only reads from their receive buffer in trickles? How can you even tell that you are blocked waiting for the send buffer to drain?

If you pursue all the potential problems and stuff them into a String -> String program, you eventually end up with almost all the interesting parts of the server existing outside your Haskell program. After all, something has to do the multiplexing, the error detection and reporting, and the timeouts. If you're writing an HTTP server in Haskell, it would be nice if it was actually written in Haskell.

Of course, none of this means the IO type as it currently exists is the best possible answer. There are reasonable complaints that can be made about it. But it at least allows you to address all of those issues within Haskell.
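As a small illustration of handling one of those issues inside Haskell, here is a sketch of a read with a timeout, using timeout from System.Timeout in base. The helper name readLineWithin and the five-second limit are invented for the example, and standard input stands in for a network connection. Whether a blocked read can actually be interrupted may depend on the platform and runtime, so treat this as a sketch rather than a recipe.

```haskell
module Main where

import System.IO (hFlush, stdout)
import System.Timeout (timeout)

-- Hypothetical helper: wait at most the given number of microseconds
-- for one line of input; Nothing means the deadline passed first.
readLineWithin :: Int -> IO (Maybe String)
readLineWithin micros = timeout micros getLine

main :: IO ()
main = do
  putStr "Type something within 5 seconds: "
  hFlush stdout
  reply <- readLineWithin 5000000
  case reply of
    Just line -> putStrLn ("Got: " ++ line)
    Nothing   -> putStrLn "Timed out; a server would close the connection here."
```

The result depends on when the input arrives, not only on what it contains, which is exactly the information a pure String -> String function never sees.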

Upvotes: 5
