chandankumar patra

Reputation: 119

Why does this code show an "invalid unicode" error?

//System.out.println("hii");'\uxxx'

The println statement is commented out, but the Unicode escape apparently is not. Why?

Upvotes: 6

Views: 14775

Answers (4)

Simon Nickerson

Reputation: 43157

javac converts \u escapes before it does anything else, including handling comments. So when it sees:

\uxxx

it identifies this as an invalid Unicode escape and stops the compilation with an error.

Upvotes: 2

pranay

Reputation: 21

When the specification for the Java language was created, the Unicode standard was accepted and the char primitive was defined as a 16-bit data type, with characters in the hexadecimal range from 0x0000 to 0xFFFF.

Also you should use "\0001" instead of "/0001".

Upvotes: 2

RealSkeptic

Reputation: 34638

Java allows you to use Unicode in your source code. Unlike many other languages, it allows you to do so anywhere, including, of course, comments. And it allows it in identifiers as well, so you can write legal Java code like this:

    String हिन्दी = "Hindi";

The variable name is perfectly legal (although coding conventions discourage such use).

So as far as javac is concerned, the source code is Unicode. The problem is that Unicode can be represented in different encodings, some editors don't support it, and there are places where a non-ASCII file is going to create problems.

So Java allows Unicode escapes in the code. This lets the file be entirely ASCII even when its identifiers or comments use Unicode. You can replace any character in the code with the equivalent Unicode escape, even "normal" characters like ;. For example, the following line:

String s = "123";

Can be written as:

String s \u003d "123"\u003b

And it will be compiled correctly and without any problems. You can, in fact, write the whole program in Unicode escapes, including the newlines. The Java compiler simply doesn't care if the Unicode escapes are inside literals or in the source itself.
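As a small illustration of the point above, here is a sketch that should compile as ordinary Java even though the assignment operator and the statement terminator are written as escapes. The class and method names (`EscapedSource`, `build`, `sameLetter`) are made up for this example.

```java
public class EscapedSource {
    // The assignment operator and the semicolon on the next statement
    // are written as Unicode escapes; javac translates them before tokenizing.
    static String build() {
        String s \u003d "123"\u003b
        return s;
    }

    // Inside a string literal the escape is translated the same way,
    // so this literal is just the one-character string "A".
    static boolean sameLetter() {
        return "\u0041".equals("A");
    }
}
```

Calling `EscapedSource.build()` returns the same string as the unescaped version would.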

But the upshot of this is that the compiler needs to interpret Unicode escapes first, and only then break the source into tokens such as identifiers, operators and comments, and after that it checks syntax etc.

Which means that if you have an illegal Unicode escape sequence in your source, it will be flagged as an error even though it's inside a comment, because at this point the compiler doesn't even know that the particular part of the code it is looking at is a comment.
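To make the ordering concrete, here is a rough sketch of what such a translation pass might look like. It is not javac's actual implementation; `UnicodeEscapePass`, `translate` and `hex` are hypothetical names, and the error is reported with a plain `IllegalArgumentException`. The point is that the pass runs over raw characters and has no idea what a comment is.

```java
public class UnicodeEscapePass {
    // Returns the value of an ASCII hex digit, or -1 if ch is not one.
    private static int hex(char ch) {
        if (ch >= '0' && ch <= '9') return ch - '0';
        if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
        if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
        return -1;
    }

    // Translates escapes in raw source text. Comments, strings and
    // tokens do not exist yet at this stage; only characters do.
    static String translate(String src) {
        StringBuilder out = new StringBuilder();
        int n = src.length();
        int i = 0;
        while (i < n) {
            char c = src.charAt(i);
            if (c != '\\') {
                out.append(c);
                i++;
            } else if (i + 1 < n && src.charAt(i + 1) == '\\') {
                // A doubled backslash is copied as a pair, so the second
                // one can never start an escape.
                out.append("\\\\");
                i += 2;
            } else if (i + 1 < n && src.charAt(i + 1) == 'u') {
                int j = i + 1;
                while (j < n && src.charAt(j) == 'u') j++; // one or more 'u'
                if (j + 4 > n) {
                    throw new IllegalArgumentException("illegal unicode escape at index " + i);
                }
                int value = 0;
                for (int k = j; k < j + 4; k++) {
                    int d = hex(src.charAt(k));
                    if (d < 0) {
                        throw new IllegalArgumentException("illegal unicode escape at index " + i);
                    }
                    value = value * 16 + d;
                }
                out.append((char) value);
                i = j + 4;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }
}
```

Feeding it the line from the question fails on the malformed escape, even though the line starts with `//`, because nothing in this pass looks at comment markers.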

Upvotes: 12

Akash Thakare

Reputation: 23002

A Unicode escape is written as \uCODE, not /uCODE. If the escape is a line terminator and you write something after it, you may get a compile-time error, because the text after the escape is no longer part of the comment. Otherwise, valid inline escapes stay inside the single-line comment, so there is no need to comment them out separately.

//Compilation Error
//System.out.println("hii"); \u000d Hello

EDIT

When the compiler starts, it replaces every Unicode escape with its value, including escapes inside comments.

So during compilation the statement above becomes:

//System.out.println("hii");
Hello
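The same effect can be shown with code that does compile. In this sketch (the names `CommentBreak` and `run` are invented for illustration), the escape for a carriage return ends the single-line comment, so the increment after it is real code:

```java
public class CommentBreak {
    static int run() {
        int counter = 0;
        // this looks commented out: \u000d counter++;
        return counter; // returns 1, because the increment above is executed
    }
}
```

If the text after the escape were not valid Java, as with `Hello` in the example above, the compiler would report an error instead.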

Upvotes: 2
