Reputation: 365
The following is an example program of a language in which I'm writing a parser.
n := 1
Do (1)-> -- The 1 in brackets is a placeholder for a Boolean or relational expression.
n := 1 + 1
Od
When the program looks like this, the parseFile
functions ends after the assignment on the first line however when the assignment is removed, it parses as expected. Below is how it's called in GHCI
, first with the first line present then removed:
λ > parseFile "example.hnry"
Assign "n" (HInteger 1)
λ > parseFile "example.hnry"
Do (HInteger 1) (Assign "n" (AExpr (HInteger 1) Add (HInteger 1)))
The expected output would look similar to this:
λ > parseFile "example.hnry"
Assign "n" (HInteger 1) Do (HInteger 1) (Assign "n" (AExpr (HInteger 1) Add (HInteger 1)))
I first assumed it was something to do with the the assignment parser but in the body of the loop, there exists an assignment which parses as expected so I was able to rule that out. I believe that the issue is within the parseFile
function itself. The following is the parseFile
function and the other functions that make up the parseExpression
function that I'm using to parse a program.
I think that the error is within parseFile
because it parses an expression only once and doesn't "loop" for the want of a better word to itself to check if there's more input left the parse. I think that's the error but I'm not quite sure.
parseFile :: String -> IO HVal
parseFile file =
do program <- readFile file
case parse parseExpression "" program of
Left err -> fail "Parse Error"
Right parsed -> return $ parsed
parseExpression :: Parser HVal
parseExpression = parseAExpr <|> parseDo <|> parseAssign
parseDo :: Parser HVal
parseDo = do
_ <- string "Do "
_ <- char '('
x <- parseHVal -- Will be changed to a Boolean expression
_ <- string ")->"
spaces
y <- parseExpression
spaces
_ <- string "Od"
return $ Do x y
parseAExpr :: Parser HVal
parseAExpr = do
x <- parseInteger
spaces
op <- parseOp
spaces
y <- parseInteger <|> do
_ <- char '('
z <- parseAExpr
_ <- char ')'
return $ z
return $ AExpr x op y
parseAssign :: Parser HVal
parseAssign = do
var <- oneOf ['a'..'z'] <|> oneOf ['A'..'Z']
spaces
_ <- string ":="
spaces
val <- parseHVal <|> do
_ <- char '('
z <- parseAExpr
_ <- char ')'
return $ z
return $ Assign [var] val
Upvotes: 2
Views: 85
Reputation: 50829
As you note, your parseFile
function parses a single expression (though maybe "statement" would be a better name) using the parseExpression
parser. You probably want to introduce a new parser for a "program" or sequence of expressions/statements:
parseProgram :: Parser [HVal]
parseProgram = spaces *> many (parseExpression <* spaces)
and then in parseFile
, replace parseExpression
with parseProgram
:
parseFile :: String -> IO [HVal]
parseFile file =
do program <- readFile file
case parse parseProgram "" program of
Left err -> fail "Parse Error"
Right parsed -> return $ parsed
Note that I've had to change the type here from HVal
to [HVal]
to reflect the fact that a program, being a sequence of expressions each of type HVal
, needs to be represented as some sort of data type capable of combining multiple HVal
s together, and a list [HVal]
is one way of doing so.
If you want a program to be an HVal
instead of an [HVal]
, then you need to introduce a new constructor in your HVal
type that's capable of representing programs. One method is to use a constructor to directly represent a block of statements:
data HVal = ... | Block [HVal]
Another is to add a constructor represent a sequence of two statements:
data HVal = ... | Seq HVal HVal
Both methods are used in real parsers. (Note that you'd normally pick one; you wouldn't use both.) To represent a sequence of three assignment statements, for example, the block method would do it directly as a list:
Block [Assign "a" (HInteger 1), Assign "b" (HInteger 2), Assign "c" (HInteger 3)]
while the two-statement sequence method would build a sort of nested tree:
Seq (Assign "a" (HInteger 1)) (Seq (Assign "b" (HInteger 2)
(Assign "c" (HInteger 3))
The appropriate parsers for these two alternatives, both of which return a plain HVal
, might be:
-- use blocks
parseProgram1 :: Parser HVal
parseProgram1 = do
spaces
xs <- many (parseExpression <* spaces)
return $ Block xs
parseProgram2 :: Parser HVal
parseProgram2 = do
spaces
x <- parseExpression
spaces
(do xs <- parseProgram2
return $ Seq x xs)
<|> return x
Upvotes: 4