Ihor M.

Reputation: 3148

ANTLR4: How do I properly create Unicode range lexer rules?

In my grammar I'd like variables to consist of Latin, Cyrillic, and Mandarin characters. For this purpose I defined a lexer rule like this:

CYRILLIC_RANGE: [\u0400–\u04FF];

This is what I see in my ANTLRWorks 2.1 output when I try to run an expression against my query:

line 1:4 token recognition error at: 'н'

What am I missing?

Upvotes: 0

Views: 1463

Answers (1)

Dan McGee

Reputation: 171

I'm not sure what you are missing, as this seems to be working for me here. Have you tried the other range syntax? Both of these should be equivalent.

CYRILLIC_RANGE : [\u0400-\u04FF] ;
CYRILLIC_RANGE : '\u0400'..'\u04FF' ;
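
One thing that may be worth checking: as displayed in the question, the rule has an en dash (–, U+2013) between the two code points rather than an ASCII hyphen-minus (-, U+002D). That could just be how the post was rendered, but if the grammar file really contains the en dash, the set holds three literal characters (U+0400, the en dash, and U+04FF) instead of a range, and 'н' (U+043D) would not match. Below is a minimal sketch of an identifier rule covering all three scripts. The grammar name, the rule names (identifiers, ID, LATIN, CYRILLIC, CJK, WS), and the choice of the CJK Unified Ideographs block for the Chinese characters are my assumptions, not something taken from your grammar.

```antlr
grammar Identifiers;   // hypothetical name; the file would be Identifiers.g4

// Entry rule: one or more identifiers, then end of input.
identifiers : ID+ EOF ;

// An identifier is one or more letters from any of the three scripts.
ID : (LATIN | CYRILLIC | CJK)+ ;

// Inside [...] only the ASCII hyphen-minus (U+002D) denotes a range.
fragment LATIN    : [a-zA-Z] ;
fragment CYRILLIC : [\u0400-\u04FF] ;   // Cyrillic block
fragment CJK      : [\u4E00-\u9FFF] ;   // CJK Unified Ideographs block (assumed)

// Skip whitespace between identifiers.
WS : [ \t\r\n]+ -> skip ;
```

Using fragments keeps each script range in one place, so ID is the only token the parser sees. If you need more than the basic blocks (for example the Cyrillic Supplement or the CJK extensions), you would have to add those ranges yourself.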

Upvotes: 2
