Reputation: 457
From The Java Language Specification, Java SE 8 Edition (section 3.8), it seems that Java letters must include ASCII letters:
The "Java letters" include uppercase and lowercase ASCII Latin letters A-Z (\u0041-\u005a), and a-z (\u0061-\u007a), and, for historical reasons, the ASCII underscore (_, or \u005f) and dollar sign ($, or \u0024). The $ sign should be used only in mechanically generated source code or, rarely, to access pre-existing names on legacy systems.
but need not include other Unicode letters (since the following sentence says "may", not "must"):
Letters and digits may be drawn from the entire Unicode character set, which supports most writing scripts in use in the world today, including the large sets for Chinese, Japanese, and Korean. This allows programmers to use identifiers in their programs that are written in their native languages.
Is it correct that an implementation conforms to the specification even if it doesn't support non-ASCII letters in identifiers?
If that is the case, then the sentence "This allows programmers to use identifiers in their programs that are written in their native languages." doesn't make much sense, since it advises using a feature that may not be supported by all implementations.
Upvotes: 1
Views: 63
Reputation: 3201
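I think you misunderstand the use of the word "may" here. It grants permission to the programmer rather than latitude to the implementation: the sentence is to be read as "It is allowed to draw letters and digits from the entire Unicode character set,..."
Thus, a conforming implementation has to accept identifiers drawn from the whole Unicode character set.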
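To make this concrete, here is a minimal sketch (not taken from the specification; the class and method names are made up for illustration). The same section 3.8 defines a "Java letter" as a character for which Character.isJavaIdentifierStart(int) returns true, so that method can be used to check individual code points. The sketch also declares a local variable whose name is the Greek letter pi, written as the Unicode escape \u03c0 so that the file compiles regardless of the source encoding.

```java
// Hypothetical demo class; the names here are illustrative only.
class UnicodeIdentifierDemo {

    // True if the code point is a "Java letter" in the sense of JLS 3.8,
    // i.e. it may appear as the first character of an identifier.
    static boolean isJavaLetter(int codePoint) {
        return Character.isJavaIdentifierStart(codePoint);
    }

    // Declares and uses a local variable named with the Greek small
    // letter pi (U+03C0). The escape is translated before tokenization,
    // so the compiler sees an ordinary one-letter identifier.
    static int greekIdentifier() {
        int \u03c0 = 3;
        return \u03c0 + 1;
    }

    public static void main(String[] args) {
        System.out.println(isJavaLetter('A'));     // ASCII letter
        System.out.println(isJavaLetter('_'));     // underscore
        System.out.println(isJavaLetter(0x03C0));  // Greek small letter pi
        System.out.println(isJavaLetter(0x4E2D));  // a CJK ideograph
        System.out.println(isJavaLetter('1'));     // digit: not a valid start
        System.out.println(greekIdentifier());
    }
}
```

If the source file is saved in an encoding the compiler is told about (for example via javac's -encoding option), the letter can also be typed directly instead of using the escape.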
Upvotes: 2