Reputation: 10363
What is the difference between "try" and "lookAhead" functions in parsec?
Upvotes: 24
Views: 4461
Reputation: 74344
The combinators try
and lookAhead
are similar in that they both let Parsec "rewind", but they apply in different circumstances. In particular, try
rewinds failure while lookAhead
rewinds success.
By the documentation, try
"pretends that it hasn't consumed any input when an error occurs" while lookAhead p
"parses p
without consuming any input", but "if p
fails and consumes some input, so does lookAhead
".
So if you think of a parser running as walking along some streaming state and either failing or succeeding, which we might write in Haskell terms as
type Parser a = [Tokens] -> (Either Error a, [Tokens])
then try
ensures that if (try p) input ---> (Left err, output)
then input == output
and lookAhead
has it such that (lookAhead p) input ---> (Right a, output)
then input == output
, but if (lookAhead p) input ---> (Left err, output)
then they may be allowed to differ.
We can see this in action by looking at the code for Parsec directly, which is somewhat more complex than my notion of Parser
above. First we examine ParsecT
newtype ParsecT s u m a
= ParsecT {unParser :: forall b .
State s u
-> (a -> State s u -> ParseError -> m b) -- consumed ok
-> (ParseError -> m b) -- consumed err
-> (a -> State s u -> ParseError -> m b) -- empty ok
-> (ParseError -> m b) -- empty err
-> m b
}
ParsecT
is a continuation-based datatype. If you look at how one of them is constructed
ParsecT $ \s cok cerr eok eerr -> ...
You'll see how we have access to the State s u
, s
, and four functions which determine how we move forward. For instance, the fail
clause of ParsecT
's Monad
instance uses the eerr
option, constructing a ParseError
from the current input position and the passed error message.
parserFail :: String -> ParsecT s u m a
parserFail msg
= ParsecT $ \s _ _ _ eerr ->
eerr $ newErrorMessage (Message msg) (statePos s)
While the most primitive successful token parse (tokenPrim
) uses a complex sequence of events eventually culminating in calling cok
with an updated State s u
.
With this intuition, the source for try
is particularly simple.
try :: ParsecT s u m a -> ParsecT s u m a
try p =
ParsecT $ \s cok _ eok eerr ->
unParser p s cok eerr eok eerr
It simply builds a new ParsecT
based on the one passed to try, but with the "empty err"
continuation in place of the consumed error. Whatever parsing combinator next sees try p
will be unable to access its actual "consumed err"
continuation and thus try is protected from changing its state on errors.
But lookAhead
is more sophisticated
lookAhead :: (Stream s m t) => ParsecT s u m a -> ParsecT s u m a
lookAhead p = do{ state <- getParserState
; x <- p'
; setParserState state
; return x
}
where
p' = ParsecT $ \s cok cerr eok eerr ->
unParser p s eok cerr eok eerr
Examining just the where
-clause we see it depends on modifying the passed parser p
to use the "empty ok"
continuation in place of the "consumed ok"
continuation. This is symmetric to what try
did. Further, it ensures that the parser state is unaffected by whatever happens when this modified p'
is run via its do
-block.
Upvotes: 34