Reputation: 231
My HTML document starts as follows:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
</head>
אבגד
If I encode my document as UTF-8, it appears correctly in the browser. If I encode as UTF-8 without BOM (which I understand is more standard), I get unusual characters.
What am I doing wrong?
Upvotes: 1
Views: 2112
Reputation: 177584
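Your web server is declaring that the encoding is ISO-8859-1, and the browser is respecting that. Ironically enough, using a byte order mark sends a stronger signal to the browser that the encoding must actually be UTF-8. (The exact reason for this is complicated and boring; in short, a BOM outranks the charset in the HTTP Content-Type header, and that header in turn outranks the meta charset tag, which is why the tag alone does not help.)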
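To see why the page turns into unusual characters, here is a small sketch in Python. It assumes the sample text from the question and uses Python's `iso-8859-1` and `utf-8-sig` codecs to stand in for what the browser does; real browsers differ in detail, so treat it as an illustration only.

```python
import codecs

# The four Hebrew letters from the question, written as escapes so the
# example does not depend on how this snippet itself is saved.
text = "\u05d0\u05d1\u05d2\u05d3"

utf8_bytes = text.encode("utf-8")         # file saved as UTF-8 without BOM
with_bom = codecs.BOM_UTF8 + utf8_bytes   # same file saved with a BOM

# Each Hebrew letter takes two bytes in UTF-8. Decoding as ISO-8859-1
# turns every byte into its own character, so 4 letters become 8
# unrelated characters.
misread = utf8_bytes.decode("iso-8859-1")

# The BOM is the three bytes EF BB BF at the start of the file. A decoder
# that recognises it knows the rest is UTF-8 and recovers the text.
recovered = with_bom.decode("utf-8-sig")

print(len(text), len(utf8_bytes), len(misread))
print(repr(misread))
print(recovered == text)
```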
How you fix this depends on which web server you are using. If this is a static resource on disk served by Apache httpd, then a directive like AddCharset UTF-8 .html in the server configuration will add the charset to the Content-Type header.
If this resource is served dynamically, then you should make sure you add the proper HTTP headers when producing the response, something like self.send_header('Content-Type', 'text/html; charset=utf-8') for Python's basic http server.
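For the dynamic case, a minimal sketch with the standard library's `http.server` might look like the following. The handler name `Utf8Handler`, the `PAGE` constant, the `run` helper and the port are made up for illustration; only the `send_header` call comes from the answer above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content: the document from the question, with the
# Hebrew letters written as escapes.
PAGE = (
    "<!DOCTYPE html>\n<html>\n<head>\n"
    '<meta charset="UTF-8">\n</head>\n'
    "\u05d0\u05d1\u05d2\u05d3\n"
)


class Utf8Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Encode without a BOM; the header below tells the browser
        # how to decode these bytes.
        body = PAGE.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port=8000):
    # Serve on localhost only; the port is an arbitrary choice.
    HTTPServer(("127.0.0.1", port), Utf8Handler).serve_forever()
```

Calling `run()` and opening http://127.0.0.1:8000/ should show the Hebrew letters. The browser's network tools can confirm that the Content-Type response header carries charset=utf-8.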
Upvotes: 1