smolloy
smolloy

Reputation: 368

Combining OR statements with Python SLY

I'm using SLY with Python to analyse a text with a relatively simple grammar. The strings I would like to analyse contain signal names from a certain system combined using AND or OR statements. That is, something like, "(SIG1 OR SIG2 OR SIG3 OR SIG4 OR SIG5) AND SIG6".

One characteristic of the data is that it will be dominated by long lists of signals OR'ed together. The parser I have built so far (see below) only understands the OR or AND operators as binary, and so outputs a nested tuple like the following:('AND', ('OR', ('OR', ('OR', ('OR', 'SIG1', 'SIG2'), 'SIG3'), 'SIG4'), 'SIG5'), 'SIG6')

Given the list of statements OR'ed together, it would be nice to combine these in OR statements of arbitrary length: ('AND', ('OR', 'SIG1', 'SIG2', 'SIG3', 'SIG4', 'SIG5'), 'SIG6')

I guess I have to edit the Parser, but I can't think how, and would appreciate any pointers you can give me.

class BoolLexer(Lexer):
    tokens = { ID, LPAREN, RPAREN, AND, OR }
    ignore = ' \t\n'

    ID      = r'[a-zA-Z_\.][a-zA-Z0-9_\.]*'
    LPAREN  = r'\('
    RPAREN  = r'\)'

    ID['AND'] = AND
    ID['OR'] = OR

class BoolParser(Parser):
    tokens = BoolLexer.tokens

    @_('expr AND term')
    def expr(self, p):
        return ('AND', p.expr, p.term)

    @_('expr OR term')
    def expr(self, p):
        return ('OR', p.expr, p.term)

    @_('term')
    def expr(self, p):
        return p.term

    @_('ID')
    def term(self, p):
        return p.ID

    @_('LPAREN expr RPAREN')
    def term(self, p):
        return p.expr

Upvotes: 2

Views: 1368

Answers (1)

rici
rici

Reputation: 241861

You can do this in a couple of ways. The most straight-forward if you want to catch all cases (including, for example, "(SIG1 OR SIG2) OR (SIG3 OR SIG4)") is to first build an AST and then recursively walk the AST, simplifying each node.

You can also do the simplification when you create the AST node, but that's not going to catch the case mentioned above:

@_('expr OR term')
def expr(self, p):
    if (isinstance(expr, tuple) and expr[0] is "OR"):
        return p.expr + (p.term,)
    else:
        return ('OR', p.expr, p.term)

I find that pretty ugly, though, because of the test at line 3. A cleaner solution is to separate the cases in the grammar. (Note: like your grammar, the following gives AND and OR equal precedence, simply associating left-to-right. That is not the usual way to write boolean expressions.)

@_('and_expr',
   'or_expr',
   'term)
def expr(self, p):
    return p[0]

@_('term OR term')
def or_expr(self, p):
    return ('OR', p.term0, p.term1)

@_('or_expr OR term')
def or_expr(self, p):
    return p.or_expr + (p.term,)

@_('term AND term')
def and_expr(self, p):
    return ('AND', p.term0, p.term1)

@_('and_expr AND term')
def and_expr(self, p):
    return p.and_expr + (p.term,)

(I've never used SLY and I didn't check any of the above code. If it doesn't work, let me know.)

Upvotes: 2

Related Questions