Reputation: 901
I'm testing how Firefox encodes characters in URLs,
but the results confused me.
HTML code:
<html lang="zh-CN">
<head>
<title>some Chinese character</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<img src="http://localhost/xxx" />
</body>
</html>
Here xxx stands for some Chinese characters. These characters must be percent-encoded (in a form like %xx) to be transported over HTTP.
First, I saved the source file as UTF-8 and opened the HTML file in Firefox. The img tag sent a request in which the "xxx" characters were encoded as UTF-8.
I then changed the meta tag to
<meta http-equiv="Content-Type" content="text/html; charset=gbk">
but nothing changed.
Second, I saved the source file as ANSI (probably GBK or GB2312).
With charset=gbk, the characters were still encoded as UTF-8.
BUT with charset=utf-8, the characters were encoded as GBK. In that case the other Chinese characters, e.g. the string in the title, were also not displayed correctly.
How can I control the browser's encoding behavior?
Upvotes: 1
Views: 2737
Reputation: 140236
UTF-8 is the standard for URL encoding. If you physically encode your source file in GBK but declare utf-8 in the Content-Type, you are just lying to the browser and will get inconsistent or non-working results.
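To make the "lying to the browser" point concrete, here is a minimal Python sketch of what happens when bytes saved in one encoding are read under a different declared charset. The sample text and the helper name `decode_as` are assumptions for illustration only; a real browser applies more elaborate sniffing and fallback rules than this.

```python
# Hypothetical sample text standing in for the Chinese characters in the page.
text = "\u4e2d\u6587"

gbk_bytes = text.encode("gbk")      # bytes on disk if the file is saved as GBK
utf8_bytes = text.encode("utf-8")   # bytes on disk if the file is saved as UTF-8


def decode_as(raw: bytes, declared: str) -> str:
    """Decode raw file bytes the way a reader trusting the declared charset would."""
    # errors="replace" substitutes U+FFFD for byte sequences that are
    # invalid in the declared charset instead of raising an exception.
    return raw.decode(declared, errors="replace")


print(decode_as(gbk_bytes, "gbk"))      # declaration matches the bytes: readable
print(decode_as(utf8_bytes, "utf-8"))   # declaration matches the bytes: readable
print(decode_as(gbk_bytes, "utf-8"))    # mismatch: replacement characters
```

Only the matching pairs round-trip to the original text, which is consistent with the garbled title the question describes.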
From RFC 3986, section 2.5:
When a new URI scheme defines a component that represents textual data consisting of characters from the Universal Character Set [UCS], the data should first be encoded as octets according to the UTF-8 character encoding [STD63]; then only those octets that do not correspond to characters in the unreserved set should be percent-encoded. For example, the character A would be represented as "A", the character LATIN CAPITAL LETTER A WITH GRAVE would be represented as "%C3%80", and the character KATAKANA LETTER A would be represented as "%E3%82%A2".
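The examples in that passage can be checked with Python's standard `urllib.parse.quote`, which encodes to UTF-8 by default. The sketch below also shows how the same text percent-encodes differently under GBK, which is the difference the question observed on the wire. The sample text and the variable names are assumptions for illustration. Writing the already percent-encoded form directly into the `src` attribute is one way to avoid depending on the browser's choice of encoding.

```python
from urllib.parse import quote

# The three examples from the quoted passage (UTF-8 is quote()'s default).
print(quote("A"))        # A
print(quote("\u00c0"))   # %C3%80  (LATIN CAPITAL LETTER A WITH GRAVE)
print(quote("\u30a2"))   # %E3%82%A2  (KATAKANA LETTER A)

# Hypothetical sample text standing in for the "xxx" in the question.
text = "\u4e2d\u6587"

# Same characters, two different byte encodings, two different URLs.
utf8_url = "http://localhost/" + quote(text, encoding="utf-8")
gbk_url = "http://localhost/" + quote(text, encoding="gbk")

print(utf8_url)  # http://localhost/%E4%B8%AD%E6%96%87
print(gbk_url)   # http://localhost/%D6%D0%CE%C4
```

A server that expects UTF-8 paths will only decode the first URL back to the original characters.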
Upvotes: 2