pyparsing: how to SkipTo the end of an indented block?

Question

I am trying to parse a structure like this with pyparsing:

identifier: some description text here which will wrap
    on to the next line. the follow-on text should be
    indented. it may contain identifier: and any text
    at all is allowed
next_identifier: more description, short this time
last_identifier: blah blah

I need something like:

import pyparsing as pp

colon = pp.Suppress(':')
term = pp.Word(pp.alphanums + "_")
description = pp.SkipTo(next_identifier)
definition = term + colon + description
grammar = pp.OneOrMore(definition)

But I am struggling to define next_identifier for the SkipTo clause, since identifiers may appear freely in the description text.

It seems that I need to include the indentation in the grammar, so that I can SkipTo the next non-indented line.

I tried:

indent_stack = [1]  # initial indent stack passed to indentedBlock

description = pp.Combine(
    pp.SkipTo(pp.LineEnd()) +
    pp.indentedBlock(
        pp.ZeroOrMore(
            pp.SkipTo(pp.LineEnd())
        ),
        indent_stack
    )
)

But I get the error:

ParseException: not a subentry (at char 55), (line:2, col:1)

Char 55 is at the very beginning of the run-on line:

...will wrap
    on to the next line...
^

This seems odd, because that character position is clearly followed by the whitespace that makes the line an indented subentry.

My traceback in ipdb looks like:

   5311     def checkSubIndent(s,l,t):
   5312         curCol = col(l,s)
   5313         if curCol > indentStack[-1]:
   5314             indentStack.append( curCol )
   5315         else:
-> 5316             raise ParseException(s,l,"not a subentry")
   5317

ipdb> indentStack
[1]
ipdb> curCol
1
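
To make the numbers above concrete, here is a minimal plain-Python sketch of how a 1-based column can be derived from a character offset. The helper name column_at is hypothetical and is not pyparsing's own code; it only illustrates the arithmetic. With the sample text, offset 55 is the first character after the newline, so its column is 1, while the first non-space character on that line sits at column 5. That is consistent with the check being made before the leading whitespace of the line has been skipped.

```python
def column_at(loc, text):
    """Return the 1-based column of offset `loc` in `text` (hypothetical helper)."""
    # rfind returns -1 when no newline precedes loc, so the first line works too.
    return loc - text.rfind("\n", 0, loc)


sample = (
    "identifier: some description text here which will wrap\n"
    "    on to the next line. the follow-on text should be\n"
)

line_start = sample.index("\n") + 1  # offset just past the first newline
first_text = line_start + 4          # offset of the "o" in "on", after four spaces

print(line_start, column_at(line_start, sample))  # 55 1
print(first_text, column_at(first_text, sample))  # 59 5
```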

I should add that the whole structure I'm matching may itself be indented by an unknown amount, so a solution like:

description = pp.Combine(
    pp.SkipTo(pp.LineEnd()) + pp.LineEnd() +
    pp.ZeroOrMore(
        pp.White(' ') + pp.SkipTo(pp.LineEnd()) + pp.LineEnd()
    )
)

...which works for the example as presented, will not work in my case, because it would also consume the subsequent definitions.
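
To pin down the intended result, here is a plain-Python sketch of the grouping. It is not a pyparsing grammar, only a reference for what a grammar would need to produce. The function name split_definitions is hypothetical, and the sketch assumes that indentation uses spaces only, that the first non-blank line starts a definition and sets the base indent, and that the input contains nothing but the block of definitions.

```python
def split_definitions(text):
    """Group lines into (term, description) pairs by indentation (hypothetical helper)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    # Assumption: the first non-blank line starts a definition and sets the base indent.
    base = len(lines[0]) - len(lines[0].lstrip(" "))
    definitions = []
    for ln in lines:
        indent = len(ln) - len(ln.lstrip(" "))
        if indent > base and definitions:
            # Deeper than the base indent: continuation of the current description,
            # even if the line happens to contain "identifier:".
            term, desc = definitions[-1]
            definitions[-1] = (term, desc + " " + ln.strip())
        else:
            # At the base indent: a new "term: description" entry.
            term, _, desc = ln.strip().partition(":")
            definitions.append((term, desc.strip()))
    return definitions


indented_sample = (
    "        identifier: some description text here which will wrap\n"
    "            on to the next line. it may contain identifier: and any text\n"
    "        next_identifier: more description, short this time\n"
    "        last_identifier: blah blah\n"
)

for term, desc in split_definitions(indented_sample):
    print(term, "->", desc)
```

The point of comparing each line against the base indent, rather than against column 1, is that the same code handles a block indented by any amount without swallowing the definitions that follow.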
