firefrorefiddle
firefrorefiddle

Reputation: 3805

Haskell: Can a function be compiled?

Consider a simple Haskell Brainf*ck interpreter. Just look at the interpret function.

import Prelude hiding (Either(..))
import Control.Monad
import Data.Char (ord, chr)

-- function in question
interpret :: String -> IO ()
interpret strprog = let (prog, []) = parse strprog
                    in execBF prog


interpretFile :: FilePath -> IO ()
interpretFile fp = readFile fp >>= interpret


type BF = [BFInstr]
data BFInstr = Left | Right | Inc | Dec | Input | Output | Loop BF

type Tape = ([Integer], [Integer])

emptyTape = (repeat 0, repeat 0)

execBFTape :: Tape -> BF -> IO Tape
execBFTape = foldM doBF

execBF :: BF -> IO ()
execBF prog = do
  execBFTape emptyTape prog
  return ()

doBF :: Tape -> BFInstr -> IO Tape
doBF ((x:lefts),   rights)   Left     = return (lefts,         x:rights)
doBF (lefts,    (x:rights))  Right    = return (x:lefts,         rights)
doBF (left,     (x:rights))  Inc      = return (left,      (x+1):rights)
doBF (left,     (x:rights))  Dec      = return (left,      (x-1):rights)
doBF (left,     (_:rights))  Input    = getChar >>= \c -> return (left, fromIntegral (ord c):rights)
doBF t@(_,      (x:     _))  Output   = putChar (chr (fromIntegral x)) >> return t
doBF t@(left,   (x:     _)) (Loop bf) = if x == 0
                                        then return t
                                        else do t' <- execBFTape t bf
                                                doBF t' (Loop bf)

simpleCommands = [('<', Left),
                  ('>', Right),
                  (',', Input),
                  ('.', Output),
                  ('+', Inc),
                  ('-', Dec)]

parse :: String -> (BF, String)
parse []          = ([], [])
parse (char:prog) = case lookup char simpleCommands of
                      Just command -> let (rest, prog') = parse prog
                                      in (command : rest, prog')
                      Nothing      ->
                        case char of
                          ']' -> ([], prog)
                          '[' -> let (loop, prog')  = parse prog
                                     (rest, prog'') = parse prog'
                                 in (Loop loop:rest, prog'')
                          _   -> parse prog

So I have a function applied like interpret "[->+<]". This gives me an IO () monadic action which executes the given program. It has the right type to be a main of some program.

Let's say I would like to have this action compiled to an executable, that is, I would like to generate an executable file with the result of interpret ... to be the main function. Of course, this executable would have to contain the GHC runtime system (for infinite lists, integer arithmetic etc.).

Questions:

  1. It is my opinion that it is not possible at all to just take the monadic action and save it to be a new file. Is this true?
  2. How could one go about reaching a comparable solution? Do the GHC Api and hint help?

EDIT

Sorry, I oversimplified in the original question. Of course, I can just write a file like this:

main = interpret "..."

But this is not what we usually do when we try to compile something, so consider interpretFile :: FilePath -> IO () instead. Let the BF program be saved in a file (helloworld.bf).

How would I go about creating an executable which executes the contents of helloworld.bf without actually needing the file?

$ ./MyBfCompiler helloworld.bf -o helloworld

Upvotes: 0

Views: 430

Answers (2)

Davislor
Davislor

Reputation: 15124

I’m not sure whether you’re asking how you write a compiler that can take as its input a file such as helloworld.bf, or how you compile a Haskell program that runs helloworld.bf.

In the former case, you would want something a little more fleshed out than this:

import System.Environment (getArgs)

main :: IO ()
main = do
  (_:fileName:_) <- getArgs
  source <- readFile fileName
  interpret source

interpret :: String -> IO ()
interpret = undefined -- You can fill in this piddly little detail yourself.

If you want the latter, there are a few different options. First, you can store the contents of your *.bf file in a string constant (or bettter yet, a Text or strict ByteString), and pass that to your interpreter function. I’d be surprised if GHC is optimistic enough to fully inline and expand that call at compile time, but in principle a Haskell compiler could.

The second is to turn Brainfuck into a domain-specific language with operators you define, so that you can actually write something like

interpret [^<,^+,^>,^.]

If you define (^<) and the other operators, the Brainfuck commands will compile to bytecode representing the Brainfuck program.

In this case, there isn’t an obvious benefit over the first approach, but with a more structured language, you can do an optimization pass, compile the source to stack-based bytecode more suitable for an interpreter to execute, or generate a more complex AST.

You might also express this idea as

interpret
  (^< ^+ ^> ^.)
  input

Here, if the Brainfuck commands are higher-order functions with right-to-left precedence, and interpret bf input = (bf begin) input, the Brainfuck code would simply compile to a function that the interpreter calls. This has the best chance of being turned into fast native code.

Previous Answer

In certain cases, a compiler can inline a function call (there are pragmas in GHC to tell it to do this). The compiler is also more likely to do what you want if you name the closure, such as:

main = interpret foo

In GHC, you can give the compiler a hint by adding

{-# INLINE main #-}

or even

{-# INLINE interpret #-}

You can check what code GHC generated by compiling the module with -S and looking through the source.

Upvotes: 2

Dan Robertson
Dan Robertson

Reputation: 4360

The answer is basically no.

There are many ways to construct IO values:

  1. Built in functions like putStrLn
  2. Monad operations like return or >>=

Once you have an IO value there are three ways to break it down:

  1. Set main equal to the value
  2. unsafePerformIO
  3. As the return value of an exported C function

All of these break down into converting an IO a into an a. There is no other way to inspect it to see what it does.

Similarly the only thing you can do with functions is put them in variables or call them (or convert them to C function pointers).

There is no sane way to otherwise inspect a function.

One thing you could do which isn’t compiling but is linking is to have your interpreter main function run on some external c string, build that into a static object, and then your “compiler” could make a new object with this C string of the program in it and link that to what you already have.


There is this theory of partial evaluation that says that if you do partial evaluation of a partial evaluator applied to an interpreter applied to some input then what you get is a compiler but ghc is not a sufficiently advanced partial evaluator.

Upvotes: 2

Related Questions