McManiaC
McManiaC

Reputation: 91

Nested parsers in happy / infinite loop?

I'm trying to write a parser for a simple markup language with happy. Currently, I'm having some issues with infinit loops and nested elements.

My markup language basicly consists of two elements, one for "normal" text and one for bold/emphasized text.

data Markup
    = MarkupText   String
    | MarkupEmph   [Markup]

For example, a text like Foo *bar* should get parsed as [MarkupText "Foo ", MarkupEmph [MarkupText "bar"]].

Lexing of that example works fine, but the parsing it results in an infinite loop - and I can't see why. This is my current approach:

-- The main parser: Parsing a list of "Markup"
Markups     :: { [Markup] }
            : Markups Markup                    { $1 ++ [$2] }
            | Markup                            { [$1]       }

-- One single markup element
Markup      :: { Markup }
            : '*' Markups1 '*'                  { MarkupEmph $2 }
            | Markup1                           { $1            }

-- The nested list inside *..*
Markups1    :: { [Markup] }
            : Markups1 Markup1                  { $1 ++ [$2] }
            | Markup1                           { [$1]       }

-- Markup which is always available:
Markup1     :: { Markup }
            : String                            { MarkupText $1 }

What's wrong with that approach? How could the be resolved?

Update: Sorry. Lexing wasn't working as expected. The infinit loop was inside the lexer. Sorry. :)

Update 2: On request, I'm using this as lexer:

lexer :: String -> [Token]
lexer [] = []
lexer str@(c:cs)

    | c == '*'              = TokenSymbol "*"   : lexer cs
    -- ...more rules...
    | otherwise             = TokenString val   : lexer rest

  where (val, rest) = span isValidChar str
        isValidChar = (/= '*')

The infinit recursion occured because I had lexer str instead of lexer cs in that first rule for '*'. Didn't see it because my actual code was a bit more complex. :)

Upvotes: 2

Views: 450

Answers (1)

Andrew T Finnell
Andrew T Finnell

Reputation: 13628

Just a warning it has been a while since I've dealt with parser generators.

It would appear you need a LR(1) parser which im not sure Happy is. I am positive once I write this someone will be able to correct me.

If your parser can't look ahead it will be stuck on this statement forever

Markups1    :: { [Markup] }
        : Markups1 Markup1 
        | Markup1

It will look for a Markups1, which in turn looks for a Markups1. Best I can guess, it isnt peforming a look ahead to Markup1 to see if it is a string.

Try rewriting it like this

Markups1    :: { [Markup] }
        : Markup1 Markups1
        | 

Essentially you want it to find the string first, then try to look for another string, if it doesn't find one it needs to end that statement.

Upvotes: 1

Related Questions