Reputation: 329
So I'm doing a homework assignment to fill in implementation for a few functions that work with a custom Haskell parser (this only works with boolean logic). I'm trying to understand what the code means or is supposed to do on a high level. Some of this (marked with "(wrong?)") was implemented by me but I think most of it is wrong as I don't really understand what is going on.
What does it mean for Haskell to parse a con/disjunction ast? This should make something that looks like p T_OR q
in parseOr right? How about parseNegation, parseNot, and parseParens?
Sorry for the broad scope I just don't understand what these are supposed to be doing. This is all the relevant code so please give your best guess if you can't tell.
parseOr implements the single rule D --> C | D. The definition says:
- parse a conjunction AST and name it "p"
- ensure that the next symbol in the stream is T_OR
- parse a disjunction AST and name it "q"
- return the AST "AND p q"
> parseOr = do
> p <- parseConjunction
> eat T_OR
> q <- parseDisjunction
> return (OR p q)
parseDisjunction implements the CFG rules for the D nonterminal:
D --> C | D
D --> C
> parseDisjunction = parseOr <|> parseConjunction
parseAnd implements the rule C --> N & C.
> parseAnd = do
> p <- parseNegation
> eat T_AND
> q <- parseConjunction
> return (AND p q)
parseConjunction should implement the rules for the C nonterminal:
C --> N & C
C --> N
It will probably look almost identical to parseDisjunction, but with different
parsers being combined.
(wrong?)
> parseConjunction = parseAnd <|> parseDisjunction
parseNot should implement the rule N --> !N.
(wrong?)
> parseNot = do
> p <- parseNegation
> eat T_NOT
> return (NOT p)
parseNegation should implement the rules for the N nonterminal:
N --> !N
N --> A
(wrong?)
> parseNegation = parseNot <|> parseConjunction
parseParens should implement the rule A --> (D).
(wrong?)
> parseParens = do
> eat L_PAREN
> p <- parseParens
> eat R_PAREN
> return (p)
eat implementation
eat :: Eq t => t -> Parser t ()
eat x = Parser $ \ts ->
case ts of
[] -> Nothing
(t:ts)
| x == t -> Just ((), ts)
| otherwise -> Nothing
Upvotes: 0
Views: 214
Reputation: 51029
What you have here is a recursive descent parser written in a monadic style, sometimes called a "monadic parser". This a very common method for writing parsers in Haskell, and there are a number of very popular parser libraries (e.g., Parsec) that support writing parsers in this style.
It looks like this one might have been hand-written for this assignment, so you probably have some boilerplate code that includes a Parser
type with Applicative
, Monad
, and Alternative
instances defined, plus the eat
utility function you included in your question. You probably also have a parse
or runParser
helper function that would be used to actually apply one of the parse functions (e.g., parseDisjunction
) to a list of input tokens.
On the one hand, the nuts and bolts of how these parsers work can be pretty complicated. However, at a high level, you can quickly develop an intuitive understanding of how they behave. In fact, it looks like your professor probably thought that just one example (parseOr
with some comments) would be enough to explain it, though that might have been optimistic!
If it's not clear to you, the way this parser is supposed to be used is that we assume that some human-readable representation of a Boolean expression, say the string:
"A and B or C or not D"
has already been processed into a list of "tokens":
ts = [T_ATOM "A", T_AND, T_ATOM "B", T_OR, T_ATOM "C", T_OR, T_NOT, T_ATOM "D"]
Note that I didn't see T_ATOM
in your code, but I presume you have something like it in the definition of the token data type.
The purpose of the parser functions, like parseOr
, is to take this token list as input and produce an abstract syntax tree (AST) to represent it. In this case, the AST will look like this:
OR (AND (ATOM "A") (ATOM "B")) (OR (ATOM "C") (NOT (ATOM "D")))
So when the comments say "parse a conjunction AST and name it p
", they are referring to the fact that the statement:
p <- parseConjunction
will take some tokens from the beginning of the list of input tokens and turn them into AST representation of a conjunction. Specifically, if parseOr
is applied to the token list ts
above, this first statement p <- parseConjunction
will take the tokens:
T_ATOM "A", T_AND, T_ATOM "B"
from the beginning of the input list and parse them into the following AST which will be assigned to p
:
(AND (ATOM "A") (ATOM "B"))
Now, the nice thing about writing parsers in this style is that you don't have to explicitly manipulate the token list. Instead, you can write parsers in a very abstract manner that closely matches the formal rules of the grammar. For example, the rule:
D -> C | D
can be pretty directly translated into:
parseOr = do
p <- parseConjuction -- this parses the conjunction "C" on the RHS
eat T_OR -- this parses the terminal symbol "|"
q <- parseDisjunction -- this parses the disjunction "D" on the RHS
return (OR p q) -- the produces the AST result of the parse
-- which is the disjunction "D" on the LHS
while a choice between multiple grammar rules:
D -> C | D
D -> C
can be translated into:
parseDisjunction =
parseOr -- try to parse "C | D" RHS
<|> parseConjunction -- if that fails, try just the "C" RHS
Does that help?
Upvotes: 2