MrG

Reputation: 5287

HTML encoding: eastern european languages

My program is fetching messages from a database, which contains English, German and several Eastern European languages. My Python script sets the encoding via:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

and uses the values fetched from the database, which are correct (I verified this in my logs).

Unfortunately all browsers I tested (IE8, Firefox 3.0.10, Opera 9.64) switch based on my local language settings to:

Everything works fine as soon as I switch the character encoding manually in the browser.

The same happens if I create the HTML file by hand and save it as UTF-8 (tested with TextMate and jEdit), although both editors display the content correctly.

That works fine for English and German, but not for Russian, for example. How can I force the "correct" character encoding?

ANSWER

The following entry within the VirtualHost (Apache configuration) section did the trick for me:

AddDefaultCharset utf-8

Many thanks for pointing me in the right direction, that helped a lot!

Upvotes: 2

Views: 2517

Answers (1)

Gumbo

Reputation: 655489

When the document is transferred over HTTP, the HTTP header information is what matters most. The HTML 4.01 specification says:

[…] conforming user agents must observe the following priorities when determining a document's character encoding (from highest priority to lowest):

  1. An HTTP "charset" parameter in a "Content-Type" field.
  2. A META declaration with "http-equiv" set to "Content-Type" and a value set for "charset".
  3. The charset attribute set on an element that designates an external resource.
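To make the priority order concrete, here is a rough Python sketch of how a client could pick the encoding. The function names `charset_from_content_type` and `resolve_charset` are made up for illustration, and real browsers apply additional heuristics, so treat this only as a model of the three rules quoted above.

```python
from email.message import Message


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    if not value:
        return None
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns the charset lowercased, or None if absent
    return msg.get_content_charset()


def resolve_charset(http_content_type=None, meta_content_type=None,
                    link_charset=None):
    """Apply the three priorities, highest first (hypothetical helper)."""
    candidates = (
        charset_from_content_type(http_content_type),  # 1. HTTP header
        charset_from_content_type(meta_content_type),  # 2. META declaration
        link_charset.lower() if link_charset else None,  # 3. charset attribute
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return None
```

Note that a charset in the HTTP header wins even when the META tag says utf-8, which matches the situation in the question where the server's default overrode the document's own declaration.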

So make sure you declare the character encoding in the Content-Type header field and not just inside the document.
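If the page is generated by a Python script, the header can also be set by the script itself instead of in the server configuration. The question does not say how the script is deployed, so the following is only a minimal sketch assuming a WSGI setup; the name `app` and the sample body are placeholders.

```python
def app(environ, start_response):
    """Minimal WSGI application that declares UTF-8 in the HTTP header."""
    # Encode the text explicitly so the bytes match the declared charset.
    body = "<p>Привет, мир</p>".encode("utf-8")
    start_response("200 OK", [
        # The charset parameter here takes priority over any META tag.
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

For a quick local check this could be served with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` from the standard library. With a plain CGI script the equivalent would be to print the `Content-Type: text/html; charset=utf-8` line before the blank line that ends the headers.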

Upvotes: 3
