user10870702

How to specify multiple lexer rules in a single rule?

I have the following parser rule:

declaration     : (KW_VARIABLE DT_IDENTIFIER) |
                  (KW_VARIABLE DT_IDENTIFIER OP_ASSIGNMENT DT_DATA_TYPES) OP_SEMICOLON;

and the following lexer rules:

KW_VARIABLE     : 'var';

OP_ASSIGNMENT   : '=';
OP_SEMICOLON    : ';';

DT_IDENTIFIER   : [a-z]+;
DT_INTEGER      : [0-9]+;
DT_DATA_TYPES   : (DT_IDENTIFIER | DT_INTEGER);

With the above rules, I want to be able to write the following code:

var a = 10;
var b = 40;
var c = 50;
var d = c;

My listener code for exiting declaration looks like this:

public override void ExitDeclaration([NotNull] PyroParser.DeclarationContext context)
{
    bool isAssigned = context.OP_ASSIGNMENT() != null;

    if (!isAssigned)
    {
        return;
    }

    Console.WriteLine(context.DT_DATA_TYPES().GetText());
    base.ExitDeclaration(context);
}

When I run this, I get an error on the first line:

line 1:8 mismatched input '10' expecting DT_DATA_TYPES

I just want to be able to refer to all data types in a single rule. How can I do this?

Upvotes: 0

Views: 59

Answers (1)

Bart Kiers

Reputation: 170308

This is incorrect:

DT_IDENTIFIER   : [a-z]+;
DT_INTEGER      : [0-9]+;
DT_DATA_TYPES   : (DT_IDENTIFIER | DT_INTEGER);

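Once input has been tokenized as a DT_IDENTIFIER or DT_INTEGER, it will never become a DT_DATA_TYPES. The lexer runs independently of the parser: for each piece of input it picks the rule that matches the most characters, and when several rules match the same text, the one defined first wins. DT_DATA_TYPES matches exactly the same text as the two rules above it, so it always loses that tie. And simply changing the order of the rules: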

DT_DATA_TYPES   : (DT_IDENTIFIER | DT_INTEGER);
DT_IDENTIFIER   : [a-z]+;
DT_INTEGER      : [0-9]+;

will not work either: that way the lexer will never produce DT_IDENTIFIER and DT_INTEGER tokens, and declaration still needs DT_IDENTIFIER for the variable name.

You could make it a parser rule instead (parser rule names start with a lowercase letter) and reference dt_data_types from declaration:

dt_data_types   : (DT_IDENTIFIER | DT_INTEGER);

DT_IDENTIFIER   : [a-z]+;
DT_INTEGER      : [0-9]+;
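
For context, here is a minimal sketch of how the whole grammar might look with that change applied. The grammar name Pyro is inferred from PyroParser in the question; the program entry rule and the WS rule are assumptions of mine, not taken from the question. I also folded the two declaration alternatives into one with an optional assignment, so the semicolon is required in both cases, which may or may not be what you intended.

```antlr
grammar Pyro;

// Assumed entry rule (not in the question): a file is a list of declarations.
program         : declaration* EOF;

// The assignment part is optional; the semicolon closes both forms.
declaration     : KW_VARIABLE DT_IDENTIFIER (OP_ASSIGNMENT dt_data_types)? OP_SEMICOLON;

// Parser rule: groups the tokens that may appear as a value.
dt_data_types   : DT_IDENTIFIER | DT_INTEGER;

// KW_VARIABLE must stay above DT_IDENTIFIER so that 'var' is not lexed as an identifier.
KW_VARIABLE     : 'var';

OP_ASSIGNMENT   : '=';
OP_SEMICOLON    : ';';

DT_IDENTIFIER   : [a-z]+;
DT_INTEGER      : [0-9]+;

// Assumed whitespace rule (not in the question).
WS              : [ \t\r\n]+ -> skip;
```

Since dt_data_types is now a parser rule, the generated DeclarationContext should expose it as context.dt_data_types() rather than context.DT_DATA_TYPES(), so the listener would have to call that (and check it for null) before calling GetText().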

Upvotes: 2
