Reputation: 3148
I'm using ANTLRWorks to test a grammar I wrote. One of the rules uses the BULLET symbol (•), but when the parse tree is built the character is omitted every time. I also tried other characters from the extended ASCII table, and they are omitted as well. Is this a known bug, or do I need to enable extended ASCII characters somehow?
Upvotes: 0
Views: 334
Reputation: 100029
ANTLR 3.x through 4.0 can match any UTF-16 code unit except U+FFFF. ANTLR 4.1 will be able to match U+FFFF as well. To match characters in the range U+10000 to U+10FFFF, you'll need to explicitly encode them as UTF-16 surrogate pairs in your grammar.
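To make the surrogate-pair point concrete, here is a small Python sketch (not part of ANTLR) that prints the UTF-16 code units of a character as `\uXXXX` escapes. The helper name `utf16_escapes` is made up for illustration, and it assumes your grammar's literals accept `\uXXXX` escapes. The bullet (U+2022) fits in a single code unit, while a character above U+FFFF such as U+1F600 needs two.

```python
def utf16_escapes(ch: str) -> str:
    # Hypothetical helper: encode as big-endian UTF-16 (no BOM), then
    # split the bytes into 16-bit code units and format each as an escape.
    data = ch.encode("utf-16-be")
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    return "".join(f"\\u{u:04X}" for u in units)


print(utf16_escapes("\u2022"))      # BULLET: one code unit
print(utf16_escapes("\U0001F600"))  # above U+FFFF: high + low surrogate
```

The first call prints `\u2022` and the second prints `\uD83D\uDE00`; the second form is the kind of pair you would write out explicitly in the grammar.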
Upvotes: 1