Dmitry

Reputation: 23

Pyparsing grammar with Optional() subclass

I am parsing router output that consists of Label : Value pairs. Values can be omitted. To parse it, I am using Optional() with a default value. Why does the parser ignore it (and the ~White in the value assignment)?

My code

from pyparsing import *

if __name__ == '__main__':
    text = '''
    Peer AS              : 65000            Peer Port            : 0    
    Peer Address         : 100.8.0.1
    Local AS             : 65000            Local Port           : 0    
    Remote Capability    : 
    Local AddPath Capabi*: Disabled
    Remote AddPath Capab*: Send - None
    Graceful Restart     : Disabled
    Import Policy        : None Specified / Inherited
    Export Policy        : Access_EXP 
    '''

    label_word = Word(printables, excludeChars=':')
    value_word = Word(printables)
    separator = Optional('*') + ': '

    label = Combine(OneOrMore(label_word | Suppress(White(' ', max=1)) + ~White())) + FollowedBy(separator)
    value = Combine(ZeroOrMore(Word(printables) | White(' ', max=1) + ~White()))
    attr_expr = label + Suppress(separator) + Optional(value, default='')

    result = Dict(OneOrMore(Group(attr_expr))).parseString(text)
    print(result.dump())

Result

- ExportPolicy: 'Access_EXP'
- GracefulRestart: 'Disabled'
- ImportPolicy: 'None Specified / Inherited'
- LocalAS: '65000'
- LocalPort: '0'
- PeerAS: '65000'
- PeerAddress: '100.8.0.1'
- PeerPort: '0'
- RemoteAddPathCapab*: 'Send - None'
- RemoteCapability: 'Local AddPath Capabi*: Disabled'

Problem

The problem is with "RemoteCapability": it has an empty value, but the parser takes the next line (another Label : Value pair) as its value. How can I solve this?

Upvotes: 1

Views: 190

Answers (1)

PaulMcG

Reputation: 63709

When parsing multiple key-value pairs where the value is optional, one generally has to use a negative lookahead as part of the repetition. That is, something can only be a value if you first check that it is not a key.

The solution for you is to have your definition of value first check that you are not parsing a label. If you are, then the parser has advanced to the next label and there is no value for the current one. To do this, just add ~label inside your ZeroOrMore repetition:

value = Combine(ZeroOrMore(~label + Word(printables) | White(' ', max=1) + ~White()))

With this change, I now get your desired parse output:

- ExportPolicy: 'Access_EXP'
- GracefulRestart: 'Disabled'
- ImportPolicy: 'None Specified / Inherited'
- LocalAS: 65000
- LocalAddPathCapabi*: 'Disabled'
- LocalPort: 0
- PeerAS: 65000
- PeerAddress: '100.8.0.1'
- PeerPort: 0
- RemoteAddPathCapab*: 'Send - None'
- RemoteCapability: ''
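
To see the lookahead in isolation, here is a minimal sketch with a much smaller grammar than yours. The input string and the names key, naive_value and guarded_value are made up for illustration. It assumes keys are plain alphabetic words followed by ':' and values are single alphanumeric words, which is simpler than the router output.

```python
from pyparsing import Group, OneOrMore, Optional, Suppress, Word, alphanums, alphas

# Hypothetical input: the key "b" has no value.
text = "a: 1 b: c: 3"

# A key is an alphabetic word followed by a colon (the colon is dropped).
key = Word(alphas) + Suppress(":")

# Naive value: any word qualifies, so the next key can be consumed as a value.
naive_value = Word(alphanums)
naive_pair = Group(key + Optional(naive_value, default=""))
without_lookahead = OneOrMore(naive_pair).parseString(text).asList()

# Guarded value: ~key fails when a key starts at this position, so the
# Optional falls back to its default instead of consuming the next key.
guarded_value = ~key + Word(alphanums)
guarded_pair = Group(key + Optional(guarded_value, default=""))
with_lookahead = OneOrMore(guarded_pair).parseString(text).asList()

print(without_lookahead)
print(with_lookahead)
```

Without the lookahead, "c" is taken as the value of "b" and parsing stops at the stray colon. With it, "b" gets the default empty string and "c" is parsed as its own key.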

Upvotes: 1
