brno32

Reputation: 424

Why does pyparsing's Optional always return a list?

I'm trying to get the LIMIT from a SQL statement:

from pyparsing import Group, Optional, pyparsing_common

query = "LIMIT 1"

LIMIT = "LIMIT"

int_num = pyparsing_common.signed_integer()

limit_clause = Optional(Group(LIMIT + int_num), None)
statement = limit_clause("limit")


if __name__ == "__main__":
    result = statement.parseString(query)
    print(result["limit"])

This prints [['LIMIT', 1]].

This is of course a contrived example, but why does it return [['LIMIT', 1]] instead of just 1? Is there a way to get it to return just 1?

Upvotes: 2

Views: 315

Answers (1)

0 _

Reputation: 11484

According to the documentation of pyparsing:

  • the operator + is an Expression operator that "creates And using the expressions before and after the operator",
  • the class And is an Expression subclass that "construct with a list of ParserElements, all of which must match for And to match",
  • the class Group is a special subclass that "causes the matched tokens to be enclosed in a list",
  • the class Optional is an Expression subclass that "construct with a ParserElement, but this element is not required to match; can be constructed with an optional default argument, ...".

So, roughly, the + operator builds an And expression whose result is the flat sequence of tokens matched by 'LIMIT' and pyparsing.pyparsing_common.signed_integer(), and the class Group then wraps that sequence in another list. This explains why both 'LIMIT' and 1 appear in the result, and also why they sit inside nested lists.
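The effect of Group can be seen in a minimal sketch that parses the same string with and without it. The names flat and grouped are only illustrative and do not come from the question:

```python
import pyparsing

int_num = pyparsing.pyparsing_common.signed_integer()

# "+" builds an And; its matched tokens come back as one flat sequence.
flat = ('LIMIT' + int_num).parseString('LIMIT 1')
# Group wraps those same tokens in one extra level of nesting.
grouped = pyparsing.Group('LIMIT' + int_num).parseString('LIMIT 1')

print(flat.asList())     # ['LIMIT', 1]
print(grouped.asList())  # [['LIMIT', 1]]
```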

The reality is a little more complex, because the returned objects are not lists, but instances of the class pyparsing.ParseResults. Running the following code:

import pyparsing

# construct parser
LIMIT = 'LIMIT'
int_num = pyparsing.pyparsing_common.signed_integer()
limit_clause = pyparsing.Optional(pyparsing.Group(LIMIT + int_num), None)
statement = limit_clause('limit')
# parse a string
query = 'LIMIT 1'
result = statement.parseString(query)
print(repr(result))

prints:

([(['LIMIT', 1], {})], {'limit': [([(['LIMIT', 1], {})], {})]})

then the statement print(repr(result['limit'])) prints:

([(['LIMIT', 1], {})], {})

and the statement print(str(result['limit'])) prints:

[['LIMIT', 1]]
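As for getting just 1, one possible approach is to drop Group, suppress the keyword, and attach the results name to the integer expression rather than to the Optional. This is a sketch that assumes the keyword itself is not needed in the output; the use of Suppress and CaselessKeyword is a choice made here, not something taken from the question:

```python
import pyparsing

# Assumption for this sketch: the keyword is matched but discarded.
LIMIT = pyparsing.Suppress(pyparsing.CaselessKeyword('LIMIT'))
int_num = pyparsing.pyparsing_common.signed_integer()
# The results name sits on the integer itself, not on the Optional.
limit_clause = pyparsing.Optional(LIMIT + int_num('limit'))

result = limit_clause.parseString('LIMIT 1')
print(result['limit'])  # 1

# Without a LIMIT clause, nothing is stored under the name.
empty = limit_clause.parseString('')
print(empty.get('limit'))  # None
```

With the original parser left unchanged, indexing through both levels of nesting, as in result['limit'][0][1], should also reach the integer, since the repr above shows one ParseResults wrapped inside another.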

For posterity, this answer uses pyparsing == 2.4.7. The current development version of pyparsing (GitHub repository) has been significantly restructured from a single module into a package, notably in commit 0b398062710dc00b952636bcf7b7933f74f125da.

A few version-related comments about the class ParseResults, which is used to represent each parser's result:

Upvotes: 2
