Rohit P

Reputation: 357

Using Unicode in JavaScript

In JavaScript we can use the line of code below (which uses a Unicode escape) to display the copyright symbol:

var x = "\u00A9 RPeripherals";

Why can't we type the copyright symbol directly using its ALT code (Alt+0169), like below:

var x = "© RPeripherals" ;

What is the difference between these two methods?

Upvotes: 0

Views: 5946

Answers (2)

Quentin

Reputation: 944474

Why can't we type the copyright symbol directly

You can. JavaScript engines are capable of parsing UTF-8 encoded source files.

What is the difference between these two methods?

One is short, requires the source file be encoded in an encoding that supports the character, and requires that you type a character that isn't printed on the keyboard's buttons.

The other is (comparatively) long, can be expressed entirely in ASCII, and can be typed with characters printed on the buttons of a standard keyboard.
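For illustration, a minimal sketch (the variable names are just examples): once the source has been parsed, both spellings produce the identical string value, so the difference is purely in how the source is written.

var escaped = "\u00A9 RPeripherals"; // ASCII-only escape spelling
var literal = "© RPeripherals";      // needs the file to be saved in an encoding that supports ©
console.log(escaped === literal);    // true
console.log(escaped.length);         // 14
console.log(literal.length);         // 14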

Upvotes: 0

user797257

Reputation:

Why can't we type the copyright symbol directly using its ALT code (Alt+0169), like below:

Who says so? Of course you can. Just configure your code editor to use UTF-8 encoding for source files. You should never use anything else to begin with...
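To see why the file encoding matters, here is a minimal sketch using the standard TextDecoder API (the single byte 0xA9 is how Windows-1252 stores ©, which is why a file saved in a legacy encoding but decoded as UTF-8 shows a replacement character instead of the symbol):

// Decoding the same byte under two different assumed encodings:
const bytes = new Uint8Array([0xA9]);
console.log(new TextDecoder("windows-1252").decode(bytes)); // "©"
console.log(new TextDecoder("utf-8").decode(bytes));        // "�" (replacement character)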

What is the difference between these two methods?

The difference is that with the \uXXXX scheme you transmit a few extra bytes on the wire: \u00A9 is six ASCII bytes in the source file, while © is only two bytes in a UTF-8 file. This kind of spelling can help if you need to embed characters in your source code that your font cannot display properly. For example, I don't have traditional Chinese characters in the font I use for programming, so if I type Chinese characters into my code editor, I see a bunch of question marks or rectangles containing Unicode codepoint digits instead of the actual characters. But someone who has Chinese glyphs in their font wouldn't have that problem.
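As a rough byte count, here is a sketch using the standard TextEncoder API (which always produces UTF-8):

var escapeSpelling = "\\u00A9"; // the six ASCII characters \ u 0 0 A 9, as they sit in the source file
var literalSpelling = "\u00A9"; // the single character ©
console.log(new TextEncoder().encode(escapeSpelling).length); // 6 bytes
console.log(new TextEncoder().encode(literalSpelling).length); // 2 bytes in UTF-8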

If that person and I wanted to share our source code, it would be preferable for them to use the \uXXXX scheme, as I would be able to verify which character it is by looking it up in a Unicode table. That's about all the difference there is.
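If you do need to hand someone the escaped spelling, a small helper like this hypothetical one (the function name is just an example) produces it from the character itself:

// Hypothetical helper: turn a BMP character into its \uXXXX spelling.
function toUnicodeEscape(ch) {
    return "\\u" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0");
}
console.log(toUnicodeEscape("©")); // \u00A9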

EDIT

The ECMAScript standard (ECMA-262, edition 5.1) specifically says that

A conforming implementation of this Standard shall interpret characters in conformance with the Unicode Standard, Version 3.0 or later and ISO/IEC 10646-1 with either UCS-2 or UTF-16 as the adopted encoding form, implementation level 3. If the adopted ISO/IEC 10646-1 subset is not otherwise specified, it is presumed to be the BMP subset, collection 300. If the adopted encoding form is not otherwise specified, it is presumed to be the UTF-16 encoding form.

So the standard guarantees that the character set is Unicode and mandates UCS-2 or UTF-16 as the encoding form (that surprised me, I thought it was UTF-8), but note that this describes how the engine interprets source text and strings internally, not how the file travels over the network... I believe browsers fetch and decode script files as UTF-8 by default, and then work with the text as UTF-16 code units. Perhaps this has changed in later editions, but this is the last one that is universally accepted.
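A minimal sketch of what the UTF-16 requirement means in practice: string values are measured in UTF-16 code units, regardless of the encoding the source file was transferred in.

console.log("\u00A9".length);                // 1  (© is a single code unit)
console.log("\uD83D\uDE00".length);          // 2  (a character outside the BMP needs a surrogate pair)
console.log("©".charCodeAt(0).toString(16)); // "a9"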

Upvotes: 2
