Kronos
Kronos

Reputation: 184

ANTLR4 : ordering problem of parser rules for a keyword used in several rules (AND, BETWEEN AND)

I am having a problem while parsing some SQL typed string with ANTLR4. The parsed string is :

WHERE a <> 17106
AND b BETWEEN c AND d
AND e BTW(f, g)

Here is a snippet of my grammar :

where_clause
    : WHERE element
    ;

element
    : element NOT_EQUAL_INFERIOR element
    | element BETWEEN element AND element
    | element BTW LEFT_PARENTHESIS element COMMA_CHAR element RIGHT_PARENTHESIS
    | element AND element
    | WORD
    ;

NOT_EQUAL_INFERIOR:     '<>';
LEFT_PARENTHESIS:       '(';
RIGHT_PARENTHESIS:      ')';
COMMA_CHAR:         ',';

BETWEEN:            B E T W E E N;
BTW:                B T W;

WORD ... //can be anything ... it doesn't matter for the problem.

This grammar generates a tree on that string (image)
(source: hostpic.xyz)

But as you can see on that same picture, the tree is not the "correct one".

ANTLR4 being greedy, it englobes everything that follows the BETWEEN in a single "element", but we want it to only take "c" and "d".

Naturally, since it englobes everything in the element rule, it is missing the second AND of the BETWEEN, so it fails.

I have tried changing order of the rules (putting AND before BETWEEN), I tried changing association to right to those rules (< assoc=right >), but those didn't work. They change the tree but don't make it the way I want it to be.

I feel like the error is a mix of greediness, association, recursivity ... Makes it quite difficult to look for the same kind of issue, but maybe I'm just missing the correct words.

Thanks, have a nice day !

Upvotes: 6

Views: 371

Answers (2)

Disclaimer: I don't actually use ANTLR (I use my own), and I haven't worked with the (rather hairy) SQL grammar in a while, so this may be off the mark, but I think to get what you want you'll have to do something along the lines of:

    ... 
    where_clause
         : WHERE disjunction
         ;

    disjunction
         : conjunction OR disjunction
         | conjunction
         ;

     conjunction
         : element AND conjunction
         | element
         ;

     element
         : element NOT_EQUAL_INFERIOR element
         | element BETWEEN element AND element
         | element BTW LEFT_PARENTHESIS element COMMA_CHAR element RIGHT_PARENTHESIS
         | WORD
         ; 
    ...

This is not the complete refactoring needed but illustrates the first steps.

Upvotes: 0

user319799
user319799

Reputation:

I think you misuse the rule element. I don't think SQL allows you to put anything as left and right limits of BETWEEN.

Not tested, but I'd try this:

expression
    : expression NOT_EQUAL_INFERIOR expression
    | term BETWEEN term AND term
    | term BTW LEFT_PARENTHESIS term COMMA_CHAR term RIGHT_PARENTHESIS
    | expression AND expression
    | term
    ;

term
    : WORD
    ;

Here your element becomes expression in most places, but in others it becomes term. The latter is a dummy rule for now, but I'm pretty sure you'd want to also add e.g. literals to it.

Upvotes: 1

Related Questions