Charu
Charu

Reputation: 2727

How do you go about implementing a interpreter(in Haskell) for a simple programming language which is an imperative-style language

I have been given with the language semantics and everything i should know.It'd only support few operations and there wouldn't be any concept of Data Types. So i can store anything in a variable and operate on them.

I'd have loops and conditions and function calls and that is it. I am looking for a start, an example not a theory book. Has anyone ever implemented such a basic language interpreter in Haskell? I am looking for pointers and references.

Thanks !

Upvotes: 2

Views: 2041

Answers (4)

user1706754
user1706754

Reputation: 127

Here is some resource https://github.com/budabudimir/imp_interpreter, this is interpreter for Simple Imperative Language described here http://fsl.cs.illinois.edu/images/0/0d/CS522-Spring-2011-PL-book-imp.pdf

Upvotes: 1

Wyzard
Wyzard

Reputation: 34591

I'm working on one right now as a practice project.

It's a dynamically-typed language, so variables don't have to be declared, but each value has an associated type. I implemented that using an algebraic data type in Haskell:

data Value = BoolValue Bool  -- ^ A Boolean value.
           | NumberValue Double  -- ^ A numeric value.
           | StringValue String  -- ^ A string value.
           -- (several others omitted for simplicity)

For execution of programs, I'm using the StateT and ErrorT monad transformers on top of IO:

-- | A monad representing a step in an RPL program.
--
-- This type is an instance of 'MonadState', so each action is a function that
-- takes an 'RPLContext' as input and produces a (potentially different)
-- 'RPLContext' as its result.  It is also an instance of 'MonadError', so an
-- action may fail (with 'throwRPLError').  And it is built on the 'IO' monad,
-- so 'RPL' computations can interact with the outside world.
type RPL = StateT RPLContext (ErrorT RPLError IO)

-- | Executes an 'RPL' computation.
-- The monadic result value (of type @a@) is discarded, leaving only the final
-- 'RPLContext'.
runRPL :: RPL a  -- ^ The computation to run
       -> RPLContext  -- ^ The computation's initial context
       -> IO (Either RPLError RPLContext)
       -- ^ An 'IO' action that performs the operation, producing either
       -- a modified context if it succeeds, or an error if it fails.
runRPL a = runErrorT . (execStateT a)

The "context" is a combination of a data stack (it's a stack-based language) and an "environment" that holds all the variables that are currently in scope:

-- | The monadic state held by an 'RPL' computation.
data RPLContext = RPLContext {
  contextStack :: Stack,  -- ^ The context's data stack.
  contextEnv :: Env  -- ^ The context's environment.
}

(Note that Stack is just an alias for [Value].)

On top of that foundation, I have a variety of helper functions to do things like manipulate the stack in the current context (held by the StateT part of the RPL monad). For example, here are the functions involved in pushing a value onto the stack:

-- | Pushes a value onto the stack.
pushValue :: Value -> RPL ()
pushValue x = modifyStack (x:)

-- | Transforms the current stack by a function.
modifyStack :: (Stack -> Stack) -> RPL ()
modifyStack f = do
  stack <- getStack
  putStack $ f stack

-- | Returns the entire current stack.
getStack :: RPL Stack
getStack = fmap contextStack get

-- | Replaces the entire current stack with a new one.
putStack :: Stack -> RPL ()
putStack stack = do
  context <- get
  put $ context { contextStack = stack }

getStack, putStack, and modifyStack are modeled after MonadState's get, put, and modify functions, but they operate on just one field of the RPLContext record.

All the language's built-in commands are just actions in the RPL monad, which build on top of tools like pushValue.

For parsing code in my language, I'm using Parsec. It's pretty nice.


On a separate track, unrelated to my RPL interpreter, you might find "Write Yourself a Scheme in 48 Hours" helpful.

Upvotes: 5

ertes
ertes

Reputation: 51

I would first encode the entire program in an EDSL. That EDSL would itself be a monad and resemble IO. A GADT makes this very easy to encode:

{-# LANGUAGE GADTs, KindSignatures #-}

module Interp where

import SomeStuff


data Expr :: * -> * where
    -- Commands
    Print   :: String -> Expr ()
    GetLine :: Expr String

    -- Variables (created on demand)
    GetVar :: Name -> Expr Value
    SetVar :: Name -> Value -> Expr ()

    -- Loop constructs
    While :: Expr Bool -> Expr a -> Expr ()
    For   :: Expr a -> Expr Bool -> Expr b -> Expr c -> Expr ()

    -- Expr is a monad
    Return :: a -> Expr a
    Bind   :: Expr a -> (a -> Expr b) -> Expr b

instance Monad Expr where
    return = Return
    (>>=)  = Bind

runExpr :: Expr a -> StateT Variables IO a
runExpr (Print str) = liftIO (putStrLn str)
runExpr GetLine     = liftIO getLine
runExpr (While p x) =
    fix $ \again -> do
        b <- runExpr p
        when b (runExpr x >> again)
runExpr ...

For simple languages you can even do something as simple as this without a dedicated EDSL:

parseProgram :: Parser (StateT Variables IO ())
parseProgram = ...

It's often forgotten that Haskell takes the notion of functional programming to its conclusion. Let the parser return the program itself. Then you just need to runStateT it with a suitable starting state.

Upvotes: 5

Joey Adams
Joey Adams

Reputation: 43400

One way would be to have your interpreter run in a StateT monad, using a Map to emulate mutable variables. Simple example:

import Control.Monad.State
import Data.Map (Map)
import qualified Data.Map as Map

type VarName = String
data Value = VInt Int
           | VString String
type InterpreterState = Map VarName Value

type InterpretM = StateT InterpreterState IO

putVar :: VarName -> Value -> InterpretM ()
putVar varname value = modify (Map.insert varname value)

getVar :: VarName -> InterpretM Value
getVar varname = do
    m <- gets (Map.lookup varname)
    case m of
        Just x  -> return x
        Nothing -> error $ "Variable " ++ varname ++ " is undefined"

The interpreter would then run in the InterpretM monad. The above accessors give it access to mutable variables (with no support for goodness like closures and lexical scope).

Upvotes: 2

Related Questions