Reputation: 23976
Say I have a script like this:
<script type="text/javascript" src="myScript.js"></script>
I've seen some sources online that claim that if the charset attribute is omitted, it defaults to ISO-8859-1. I've seen others that claim it assumes the same encoding as the HTML page that contains the script tag. What's the truth?
I need to know because my JavaScript file contains literal strings that will be inserted into the HTML, and which include non-ASCII characters like the Euro symbol (€). I realize that adding a charset attribute or just HTML encoding these characters should solve my problem, but I'd still like to understand the default behavior.
EDIT: To clarify one point, I need to know not just what the standards say, but how browsers actually act. The behavior described here: http://joconner.com/2008/09/javascript-file-encoding/ seems to suggest that browsers don't always assume ISO-8859-1.
Upvotes: 18
Views: 4308
Reputation: 382802
HTML5 4.11.1 The script element:
If the script element has a charset attribute, then let the script block's character encoding for this script element be the result of getting an encoding from the value of the charset attribute.
Otherwise, let the script block's fallback character encoding for this script element be the same as the encoding of the document itself.
The quote links to the DOM document element, which has an encoding property.
TODO: find how the encoding of that object is determined from the standards.
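As a rough sketch of the two quoted rules only: the helper name `scriptBlockEncoding` is made up, and upper-casing the label is a simplification of the spec's "getting an encoding" step, which maps labels to canonical encoding names. In a browser, the document's encoding can be read from `document.characterSet`; here a plain object stands in for the document so the snippet runs anywhere.

```javascript
// Hypothetical helper mirroring the two quoted rules, nothing more.
// charsetAttr: value of the script element's charset attribute (or null).
// doc: any object with a characterSet property, e.g. the browser's `document`.
function scriptBlockEncoding(charsetAttr, doc) {
  const label = (charsetAttr || "").trim();
  if (label !== "") {
    // Simplification: a real browser resolves the label to an encoding name.
    return label.toUpperCase();
  }
  // No charset attribute: fall back to the encoding of the document itself.
  return doc.characterSet;
}

// Stand-in for a page that was served as windows-1252.
const fakeDocument = { characterSet: "windows-1252" };

console.log(scriptBlockEncoding("utf-8", fakeDocument)); // explicit attribute is used
console.log(scriptBlockEncoding(null, fakeDocument));    // falls back to the document
```

In a real page you would pass `document` itself as the second argument.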
Upvotes: 1
Reputation: 690
According to w3schools.com the value is ISO-8859-1 and this is supported across all major browsers.
According to the HTTP 1.1 specification:
When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP. Data in character sets other than "ISO-8859-1" or its subsets MUST be labeled with an appropriate charset value. See section 3.4.1 for compatibility problems.
So anything that doesn't conform to this does not technically follow the HTTP 1.1 specification.
Upvotes: 2
Reputation: 26183
The W3C defines a standard way for a browser to determine the character encoding; you can read about it here: http://www.w3.org/TR/html4/charset.html#spec-char-encoding
To sum up, conforming user agents must observe the following priorities when determining a document's character encoding (from highest priority to lowest):
- An HTTP "charset" parameter in a "Content-Type" field.
- A META declaration with "http-equiv" set to "Content-Type" and a value set for "charset".
- The charset attribute set on an element that designates an external resource.
In addition to this list of priorities, the user agent may use heuristics and user settings. For example, many user agents use a heuristic to distinguish the various encodings used for Japanese text. Also, user agents typically have a user-definable, local default character encoding which they apply in the absence of other indicators.
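The priority list above can be sketched as a first-match lookup. This is only an illustration: the function names and the shape of the `hints` object are made up, and the heuristics and user settings the spec allows are reduced to a single `userDefault` fallback.

```javascript
// Hypothetical helper: pull the charset parameter out of a Content-Type value.
// Returns null when no charset parameter is present.
function charsetFromContentType(contentType) {
  const match = /charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || "");
  return match ? match[1] : null;
}

// Hypothetical resolver following the quoted order, highest priority first:
// HTTP header, then META declaration, then the charset attribute on the
// referencing element, then the user agent's local default.
function resolveEncoding(hints) {
  return (
    hints.httpCharset ||
    hints.metaCharset ||
    hints.elementCharset ||
    hints.userDefault ||
    null
  );
}

const resolved = resolveEncoding({
  httpCharset: charsetFromContentType("application/javascript; charset=UTF-8"),
  elementCharset: "ISO-8859-1",
  userDefault: "windows-1252",
});
console.log(resolved); // the HTTP header wins over the element's attribute
```

If the server sends no charset parameter, `charsetFromContentType` returns null and the lookup moves on to the next hint.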
Upvotes: 7
Reputation: 1514
HTML encoding strings and passing them into JavaScript variables can cause problems, especially if you use hex codes, as I'm told JavaScript prefers octal.
If you can use UTF-8 as the charset of your web pages, then JavaScript works with these characters just fine. I use this a lot, and there has never been a need to define a charset for the included script files.
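If switching everything to UTF-8 is not an option, a different route (not the one this answer describes) is to keep the script file pure ASCII by writing non-ASCII characters as JavaScript `\uXXXX` escapes, so the file decodes the same way under any ASCII-compatible encoding. The helper name `toAsciiSafe` below is made up for illustration.

```javascript
// "\u20AC" is the Euro sign written as a JavaScript Unicode escape.
// The source file stays ASCII-only, so the charset guess no longer matters.
const euro = "\u20AC";

// Hypothetical helper: rewrite every non-ASCII UTF-16 code unit in a string
// as a \uXXXX escape, e.g. to generate ASCII-only string literals.
function toAsciiSafe(str) {
  return str.replace(/[^\x00-\x7F]/g, function (ch) {
    const hex = ch.charCodeAt(0).toString(16).toUpperCase();
    return "\\u" + hex.padStart(4, "0");
  });
}

console.log(euro.length);                  // one UTF-16 code unit
console.log(toAsciiSafe("Price: 5" + euro)); // escape sequence replaces the symbol
```

Characters outside the Basic Multilingual Plane come out as two escapes (a surrogate pair), which is still a valid JavaScript string literal.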
Upvotes: 0