vanilla

Reputation: 331

Is it possible to configure an ANTLR grammar to use two tokens that have the same structure?

For the below grammar,

grammar names;

fullname : TITLE? FIRST_NAME LAST_NAME;

TITLE : 'Mr.' | 'Ms.' | 'Mrs.' ;

FIRST_NAME : ('A'..'Z' | 'a'..'z')+ ;

LAST_NAME : ('A'..'Z' | 'a'..'z')+ ;

WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ -> skip ;

When parsing input like "Mr. John Smith", it throws an exception:

 mismatched input 'Smith' expecting LAST_NAME

Is it possible to configure ANTLR to handle this case? If not, what would be an alternative way to handle it?

Please note that it's not limited to this simple case.

Upvotes: 3

Views: 507

Answers (1)

Jacob Krall

Reputation: 28825

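There is no syntactic difference between FIRST_NAME and LAST_NAME, so the lexer cannot tell them apart: when two lexer rules match the same text, ANTLR uses the one defined first, so both names are tokenized as FIRST_NAME. Use a single NAME token and label its two occurrences in the parser rule: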

grammar names;

fullname : TITLE? first=NAME last=NAME;

TITLE : 'Mr.' | 'Ms.' | 'Mrs.' ;

NAME : ('A'..'Z' | 'a'..'z')+ ;

WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ -> skip ;

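Then you can read the labels from the generated rule context to extract the parsed values out of the match. In the ANTLR 4 Java target, token labels become fields on the context object, so the values are available as ctx.first.getText() and ctx.last.getText().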
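To make the reasoning concrete, here is a small Python sketch of the tie-breaking rule described above: the longest match wins, and on a tie the rule listed first wins. It is a toy model, not ANTLR's implementation; the names tokenize, parse_fullname, ORIGINAL and FIXED are hypothetical and exist only for this illustration.

```python
import re

# Lexer rules in definition order, written as regexes (illustrative only).
ORIGINAL = [
    ("TITLE", r"Mr\.|Ms\.|Mrs\."),
    ("FIRST_NAME", r"[A-Za-z]+"),
    ("LAST_NAME", r"[A-Za-z]+"),
    ("WHITESPACE", r"[\t \r\n\f]+"),
]
FIXED = [
    ("TITLE", r"Mr\.|Ms\.|Mrs\."),
    ("NAME", r"[A-Za-z]+"),
    ("WHITESPACE", r"[\t \r\n\f]+"),
]


def tokenize(text, rules):
    """Toy lexer: longest match wins; ties go to the rule listed first."""
    tokens = []
    pos = 0
    while pos < len(text):
        best = None  # (length, rule name, matched text)
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strictly longer only, so an earlier rule keeps a tie.
            if m and m.end() > pos and (best is None or len(m.group()) > best[0]):
                best = (len(m.group()), name, m.group())
        if best is None:
            raise ValueError(f"no rule matches at position {pos}")
        pos += best[0]
        if best[1] != "WHITESPACE":  # mirrors '-> skip'
            tokens.append((best[1], best[2]))
    return tokens


def parse_fullname(tokens):
    """Mimics: fullname : TITLE? first=NAME last=NAME ;"""
    rest = list(tokens)
    title = rest.pop(0)[1] if rest and rest[0][0] == "TITLE" else None
    if [kind for kind, _ in rest] != ["NAME", "NAME"]:
        raise ValueError("expected exactly two NAME tokens")
    # The labels only record position; both tokens have the same type.
    return {"title": title, "first": rest[0][1], "last": rest[1][1]}


if __name__ == "__main__":
    print(tokenize("Mr. John Smith", ORIGINAL))
    print(parse_fullname(tokenize("Mr. John Smith", FIXED)))
```

With the ORIGINAL rules, 'Smith' can never come out as LAST_NAME, because FIRST_NAME matches the same text and is defined earlier. With the FIXED rules, the distinction between first and last name is made by position in the parser rule rather than by token type.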

Upvotes: 5
