Is the JavaScript encoding format inherited from the HTML document's format?

I am generating JavaScript files for a project. The code is saved in JS text files using UTF-8. This code may contain strings with accented characters. These strings may be displayed in sections, so I have HTML-escaped them using StringEscapeUtils from Apache Commons.

From here, I take it that this practice is safe and sufficient for HTML documents using UTF-8, but what about imported JavaScript files? Do they 'inherit' the same format as the referencing HTML document?

Upvotes: 0

Views: 146

Answers (1)

Oded

Reputation: 499062

The charset attribute on the script element is optional (#IMPLIED, meaning the browser picks a default when it is omitted), as can be seen in this DTD fragment from the HTML 4.01 spec:

<!ELEMENT SCRIPT - - %Script;          -- script statements -->
<!ATTLIST SCRIPT
  charset     %Charset;      #IMPLIED  -- char encoding of linked resource --
  type        %ContentType;  #REQUIRED -- content type of script language --
  src         %URI;          #IMPLIED  -- URI for an external script --
  defer       (defer)        #IMPLIED  -- UA may defer execution of script --
  >

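The character set the browser actually uses is the one sent in the charset parameter of the script's Content-Type HTTP header, if supplied. If the header carries no charset, the charset attribute on the element itself is used, and if that is missing too, the browser falls back to the encoding of the referencing HTML document.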
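As a rough sketch of that fallback order (my reading of the precedence rules in HTML 4.01 section 5.2.2, plus the fallback to the document's encoding that browsers apply), here is a small helper. The function name `resolveScriptEncoding` and its parameters are hypothetical and only for illustration; this is not a browser API, and it ignores byte order marks:

```javascript
// Hypothetical helper: not a browser API, just the fallback order written out.
// Each argument is an encoding label such as "UTF-8", or undefined when absent.
function resolveScriptEncoding({ httpCharset, charsetAttribute, documentEncoding }) {
  // 1. A charset parameter on the script's Content-Type response header wins.
  if (httpCharset) return httpCharset;
  // 2. Otherwise the charset attribute on the <script> element is used.
  if (charsetAttribute) return charsetAttribute;
  // 3. Otherwise the script is read with the referencing document's encoding.
  return documentEncoding;
}

// No header charset and no attribute: the script "inherits" from the document.
console.log(resolveScriptEncoding({ documentEncoding: "UTF-8" }));
```

So a UTF-8 script referenced from a UTF-8 page with no other hints is read as UTF-8, which is the "inheritance" case from the question.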

If you want to ensure the right character set is used, put it in the script element declaration:

<script charset="UTF-8" ... ></script>
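To see why this matters for accented characters, here is a minimal sketch of what happens when a UTF-8 script is read with the wrong encoding. It assumes an environment where `TextEncoder` and `TextDecoder` are available (modern browsers and Node.js), and it imitates an ISO-8859-1 reader by mapping each byte to one character:

```javascript
// "é" saved as UTF-8 occupies two bytes: 0xC3 0xA9.
const utf8Bytes = new TextEncoder().encode("é");

// Read as UTF-8, the two bytes come back as the single character "é".
const readAsUtf8 = new TextDecoder("utf-8").decode(utf8Bytes);

// Read as ISO-8859-1, every byte becomes its own character,
// so the single "é" turns into two unrelated characters.
const readAsLatin1 = Array.from(utf8Bytes, (byte) => String.fromCharCode(byte)).join("");

console.log(readAsUtf8);   // é
console.log(readAsLatin1); // Ã©
```

This is the garbling you would see in string literals if the script were decoded with a different character set than the one it was saved in.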

Upvotes: 1
