Ruslan

Reputation: 43

How do I select information from a given block using pyparsing?

I need to extract information from one specific block only. Why doesn't parsing stop when it reaches the keyword I pass as stopOn?

import pyparsing as pp
from pprint import pprint

source_string = '''

Code: 49662 (ACTIVE)
****************** Source: ******************
Key: Value
Description: test description 1
Key: Value
****************** Base information ******************
Key: Value
Description: test description 2
Key: Value
****************** Additional information ******************
Key: Value
Description: test description 3
Key: Value

'''

grammar = pp.Combine(pp.OneOrMore( \
    pp.CaselessLiteral('Code') + pp.Optional(':') + pp.restOfLine() \
    + pp.SkipTo(pp.CaselessLiteral('Base information')).suppress() \
    ^ pp.CaselessLiteral('Description') + pp.restOfLine() \
    , stopOn = pp.CaselessLiteral('Additional information')
))
pprint(grammar.searchString(source_string).asList())

Result:

[['Code: 49662 (ACTIVE)'], ['Description: test description 2'], ['Description: test description 3']]

I excluded the first value, "test description 1", by using SkipTo(). How can I also exclude "test description 3" from the results?
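
A likely explanation, sketched on a toy input rather than the original data: stopOn only ends the repetition inside a single match, while searchString keeps scanning the rest of the input for further matches, so text after the stop keyword can still be picked up. The names text, item and grammar below are made up for this illustration.

```python
import pyparsing as pp

# Toy input: two items, a stop keyword, then two more items.
text = "a a STOP a a"

item = pp.Literal("a")
# stopOn ends this OneOrMore repetition when STOP comes next.
grammar = pp.OneOrMore(item, stopOn=pp.Literal("STOP"))

# parseString matches once from the start, so it stops before STOP.
parsed = grammar.parseString(text).asList()

# searchString keeps scanning after each match, so it also finds
# the items that come after STOP.
searched = grammar.searchString(text).asList()

print(parsed)
print(searched)
```

If this is what happens in the grammar above, stopOn alone cannot keep searchString away from the later block; the input passed to the search has to be limited instead.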

Upvotes: 1

Views: 48

Answers (1)

Ruslan

Reputation: 43

This solved my problem:

import pyparsing as pp
from pprint import pprint

source_string = '''

Code: 49662 (ACTIVE)
****************** Source: ******************
Key: Value
Description: test description 1
Key: Value
Source: ip adress1
****************** Base information ******************
Source: ip adress
Key: Value
Description: test description 2

Key: Value
****************** Additional information ******************
Key: Value
Description: test description 3
Key: Value
Source: ip adress3
'''

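# Match either the "Code" line, or the "Base information" header followed by
# the Description/Source lines of that block.
grammar = (
    pp.CaselessLiteral('Code') + pp.restOfLine()
    ^ pp.Combine(pp.CaselessLiteral('Base information') + pp.restOfLine()).suppress()
    + pp.OneOrMore(
        pp.Combine(pp.CaselessLiteral('Description') + pp.restOfLine())
        ^ pp.Combine(pp.CaselessLiteral('Source') + pp.restOfLine())
        # Consume and discard any other word so the repetition keeps going.
        ^ pp.Word(pp.printables).suppress(),
        # Stop at the asterisks that start the next section header.
        stopOn='*',
    )
)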


pprint(grammar.searchString(source_string).asList())
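
An alternative sketch, not the approach used in the code above: first cut out the text of the wanted block with SkipTo, then search only inside that text. It assumes every section header is a title surrounded by runs of asterisks, as in the sample from the question. The names stars, base_header, block, description and code_line are illustrative.

```python
import pyparsing as pp

source_string = """
Code: 49662 (ACTIVE)
****************** Source: ******************
Key: Value
Description: test description 1
Key: Value
****************** Base information ******************
Key: Value
Description: test description 2
Key: Value
****************** Additional information ******************
Key: Value
Description: test description 3
Key: Value
"""

# Assumption: a section header is asterisks, a title, then asterisks.
stars = pp.Word("*")
base_header = stars + pp.CaselessLiteral("Base information") + stars

# The block body is everything up to the next run of asterisks,
# or up to the end of the input if this is the last block.
block = base_header.suppress() + pp.SkipTo(stars | pp.StringEnd())("body")

code_line = pp.Combine(pp.CaselessLiteral("Code") + pp.restOfLine)
description = pp.Combine(pp.CaselessLiteral("Description") + pp.restOfLine)

# Step 1: isolate the text of the "Base information" block.
body = block.searchString(source_string)[0]["body"]

# Step 2: search for Description lines only inside that block.
descriptions = [tokens[0] for tokens in description.searchString(body)]

# The Code line sits outside the block, so search the full input for it.
code = code_line.searchString(source_string)[0][0]

print(code)
print(descriptions)
```

Splitting the work into two searches keeps each grammar small and means no stopOn is needed, at the cost of scanning the input twice.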

Upvotes: 1
