Reputation: 20726
I have a strange problem with some documents on my webpage.
My data is stored in a MySQL database, UTF-8 encoded. When I read the values, my webpage displays:
Rezept : Gem�se mal anders (Gem�selaibchen)
I need ü / &uuml;!
The content in the database is "Gemüse ...".
The raw data in my error_log looks like this:
[title] => Rezept : Gemüse mal anders (Gemüselaibchen)
The webpage header is:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<!--[if IE]>
<link rel="stylesheet" href="http://www.dev-twitter-gewitter.com/css//blueprint/ie.css"
type="text/css" media="screen, projection">
<![endif]-->
<meta name="text/html; charset=UTF-8" content="Content-Type" />
Upvotes: 7
Views: 916
Reputation: 72504
You have to set the encoding of your web page.
There are three ways to set the encoding:
HTML/XHTML: Use an HTTP header:
Content-Type: text/html; charset=UTF-8
HTML: Use a meta element (also possible for XHTML, but somewhat unusual):
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
XHTML only: Set the encoding in the XML declaration (preferred for XHTML):
<?xml version="1.0" encoding="UTF-8"?>
If you want to verify the problem first:
First change the encoding manually using your browser. If that works you can set it in your HTML file. Make sure you reset the manual encoding to automatic detection, otherwise it'll work on your workstation, but not on your users' workstations!
A PHP speciality: Make sure your internal encoding is set to UTF-8, too! All outputs are converted to this encoding.
You can enforce the internal encoding by calling mb_internal_encoding at the top of every file.
Finally: none of this helps if your code isn't actually UTF-8 encoded! If it is, check whether any helper functions might be destroying the UTF-8 encoding.
Upvotes: 12
Reputation: 5062
The problem is likely that the connection to the database uses latin1. As far as I know, this is the default in many MySQL setups.
That means that even if you store the data as UTF-8 in the database, you will get it as latin1 when you fetch it, because the character set is converted on the fly to match the connection.
You have two options:
1. Change the default connection character set to UTF-8
This could cause trouble if other applications hosted on the same database server expect ISO-8859-1 from the database: changing the config changes the behaviour for all users of the MySQL server.
2. Change the connection charset after each connect to the database
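If you use PHP 5 you can use the built-in function below (the mysql_* extension was removed in PHP 7; with mysqli the equivalent is mysqli_set_charset):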
mysql_set_charset('utf8');
See http://php.net/manual/en/function.mysql-set-charset.php for more details.
If you are on PHP 4 you can do this with a simple SQL query, like so:
mysql_query("SET NAMES 'UTF8'");
See http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html for more details.
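To make the on-the-fly conversion concrete, here is a minimal sketch in Python (used only to show the bytes, it is not part of the PHP fix). The function name render and its parameters are made up for illustration: it assumes the server re-encodes the stored text into the connection character set and the browser then decodes those bytes with the charset the page declares.

```python
def render(stored_text: str, connection_charset: str, page_charset: str = "utf-8") -> str:
    """Simulate what the browser shows (hypothetical helper, not a MySQL API).

    stored_text        -- the text as it is correctly stored in the database
    connection_charset -- charset the server converts to before sending
    page_charset       -- charset the web page declares to the browser
    """
    # The server converts the text to the connection charset.
    wire_bytes = stored_text.encode(connection_charset)
    # The browser decodes the bytes using the declared page charset;
    # invalid sequences become U+FFFD, the replacement character.
    return wire_bytes.decode(page_charset, errors="replace")


title = "Rezept : Gemüse mal anders (Gemüselaibchen)"

print(render(title, "latin-1"))  # connection uses latin1: ü becomes U+FFFD
print(render(title, "utf-8"))    # connection uses UTF-8: ü survives
```

With a latin1 connection and a UTF-8 page the output contains the replacement character, which matches the symptom in the question. Switching the connection to UTF-8 (option 2 above) makes both sides agree.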
Upvotes: 2
Reputation: 8876
You should check the HTTP headers too, especially how your web server is configured. I had a similar issue in the past which was caused by the configuration of Apache: it was configured to always send the encoding in the Content-Type header, and that overrode the encoding passed via the <meta> tag, because the HTML page and the web server differed in that value.
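As a rough model of that precedence, the sketch below picks the charset a browser would use when both sources are present. It is a simplification (real browsers also consider a byte order mark and other signals), and the function name effective_charset is hypothetical. It uses Python's standard email.message.Message only to parse the charset parameter out of a Content-Type value.

```python
from email.message import Message
from typing import Optional


def effective_charset(http_content_type: Optional[str],
                      meta_charset: Optional[str]) -> Optional[str]:
    """Return the charset that wins in this simplified model.

    A charset in the HTTP Content-Type header takes precedence over
    the one declared in a <meta> element.
    """
    if http_content_type:
        msg = Message()
        msg["Content-Type"] = http_content_type
        header_charset = msg.get_content_charset()  # lower-cased, or None
        if header_charset:
            return header_charset
    # No charset in the HTTP header: fall back to the <meta> declaration.
    return meta_charset.lower() if meta_charset else None


# Server insists on ISO-8859-1 while the page declares UTF-8.
print(effective_charset("text/html; charset=ISO-8859-1", "UTF-8"))
# Server sends no charset, so the <meta> value is used.
print(effective_charset("text/html", "UTF-8"))
```

If the first case describes your server, fixing the <meta> tag alone will not help; the server configuration or the header sent by the script has to change.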
Upvotes: 0
Reputation: 655129
That Unicode replacement character � only appears when the encoding is incorrect. So in your case you declared your data as UTF-8 encoded but it wasn’t (at least the part you quoted). The ü encoded in ISO 8859-1 is 0xFC, but that’s an invalid octet in UTF-8.
So you need to make sure that your data is actually encoded with UTF-8. There are functions that can check if a given string is UTF-8, e.g. mb_detect_encoding or this is_utf8 function.
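The same check can be sketched in a few lines of Python, purely to show the bytes involved. The is_utf8 below is an illustrative stand-in written for this example, not the linked PHP function: it simply tries a strict UTF-8 decode.

```python
def is_utf8(data: bytes) -> bool:
    """Return True if data is a valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")  # strict decoding raises on invalid bytes
        return True
    except UnicodeDecodeError:
        return False


latin1_bytes = "Gemüse".encode("iso-8859-1")  # ü is the single byte 0xFC
utf8_bytes = "Gemüse".encode("utf-8")         # ü is the two bytes 0xC3 0xBC

print(latin1_bytes, is_utf8(latin1_bytes))
print(utf8_bytes, is_utf8(utf8_bytes))

# Decoding the ISO 8859-1 bytes leniently as UTF-8 yields U+FFFD for 0xFC.
print(latin1_bytes.decode("utf-8", errors="replace"))
```

Running a check like this on the raw bytes your script is about to output tells you whether the page declaration or the data is at fault.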
Upvotes: 5
Reputation: 54989
MySQL needs to know you want the output as UTF-8 - it's likely configured to send as latin1, so your browser sees the invalid UTF-8 byte sequences and outputs the "not a character" glyph.
Send the query "SET NAMES utf8" immediately after opening the MySQL connection, or change the configuration (if possible).
Upvotes: 8
Reputation: 20726
utf8_encode fixed my problem. I'm not sure why ;) The data in the database is UTF-8 and the website is UTF-8...
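A likely explanation, in line with the answers about the connection charset: PHP's utf8_encode converts ISO-8859-1 to UTF-8, so it only helps if the bytes arriving from the database were ISO-8859-1 in the first place. The Python sketch below imitates that conversion; the helper name utf8_encode_like is made up for illustration.

```python
def utf8_encode_like(data: bytes) -> bytes:
    """Treat data as ISO-8859-1 and re-encode it as UTF-8.

    Hypothetical helper that mimics what PHP's utf8_encode does.
    """
    return data.decode("iso-8859-1").encode("utf-8")


# Bytes as they would arrive over a latin1 connection.
from_latin1_connection = "Gemüse".encode("iso-8859-1")
fixed = utf8_encode_like(from_latin1_connection)
print(fixed.decode("utf-8"))  # readable again

# Applying the same conversion to data that is already UTF-8 double-encodes it.
already_utf8 = "Gemüse".encode("utf-8")
broken = utf8_encode_like(already_utf8)
print(broken.decode("utf-8"))  # mojibake
```

That second case is why setting the connection charset is the more robust fix: once the connection really delivers UTF-8, a leftover utf8_encode call would start corrupting the text.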
Upvotes: 0
Reputation: 281365
Do this:
header('Content-Type: text/html; charset=utf-8');
before outputting any content.
Upvotes: 3