Reputation: 325
Background: I have a table, events, whose default character set is latin1, but individual columns in it are set to utf8. The column I'll pick to discuss is 'title', which is one of the utf8 columns. The website is set to UTF-8 both via Apache and the meta tag.
As a test, if I save décor or ©
into the title field and perform
select title, LENGTH(title) as len, CHAR_LENGTH(title) as chlen
from events where length(title) != char_length(title)
I will get décor or ©, 12, 10
back as a result, which is expected and shows that the data has indeed been saved properly into my utf8 column.
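The same byte-versus-character distinction can be checked on the PHP side. This is a minimal sketch, assuming the mbstring extension is available; the string is written with explicit byte escapes so the result does not depend on how the source file is saved.

```php
<?php
// "décor or ©" spelled out as UTF-8 bytes (é = C3 A9, © = C2 A9).
$title = "d\xC3\xA9cor or \xC2\xA9";

// strlen() counts bytes, like MySQL's LENGTH().
$bytes = strlen($title);

// mb_strlen() counts characters, like MySQL's CHAR_LENGTH().
$chars = mb_strlen($title, 'UTF-8');

echo "$bytes bytes, $chars characters\n"; // 12 bytes, 10 characters
```

The two multibyte characters account for the difference of two between the counts.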
However, when I echo the title out to a page, it is mangled into d�cor or �
which makes no sense to me since, as mentioned before, the character encoding is set to UTF-8 on the page.
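One way to reproduce this symptom is sketched below. It assumes (this is not confirmed anywhere above) that the string reaches PHP as Latin-1 bytes, and it needs the mbstring extension. In Latin-1, é and © are single bytes that are not valid UTF-8 on their own, so a browser reading the page as UTF-8 shows a replacement character for each.

```php
<?php
// "décor or ©" as UTF-8 bytes.
$utf8 = "d\xC3\xA9cor or \xC2\xA9";

// Simulate a conversion to Latin-1 somewhere between the column and PHP.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

echo bin2hex($latin1), "\n"; // 64e9636f72206f7220a9

// The lone E9 and A9 bytes are not valid UTF-8 sequences.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
```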
Not sure if this final detail makes a difference but if I edit the page and resubmit the mangled text it turns into d%uFFFDcor or %uFFFD
both in the database and when displayed to the page. Further submits cause no change.
Actual Question: Does anyone have an idea as to what I may be doing wrong? :-P
Upvotes: 1
Views: 975
Reputation: 165201
Well, there's likely one of three problems.
1. MySQL's connection is not using UTF-8
This means that it's converted to another charset (likely Latin-1) before it hits PHP. I've found the best solution is to run the following queries:
SET CHARACTER SET utf8;
SET character_set_database = "utf8";
SET character_set_connection = "utf8";
SET character_set_server = "utf8";
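As an alternative to issuing these statements by hand, the connection charset can usually be declared when connecting. The sketch below uses PDO rather than the queries above. The host, database name, and credentials are placeholders, and buildUtf8Dsn is a hypothetical helper, not part of PDO.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that requests a utf8 connection.
function buildUtf8Dsn(string $host, string $dbName): string
{
    // The charset element of the DSN is honored by pdo_mysql since PHP 5.3.6.
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8', $host, $dbName);
}

// Placeholder values; replace with your own server details.
$dsn = buildUtf8Dsn('localhost', 'mydb');
echo $dsn, "\n";

// Opening the connection needs a reachable MySQL server, so it is left commented out:
// $pdo = new PDO($dsn, 'user', 'password');
// With mysqli, the rough equivalent is: $mysqli->set_charset('utf8');
```

Setting the charset through the driver also tells the client library which encoding is in use, which a plain SET query does not.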
2. The page rendered is not really set to UTF-8
Set both the Content-Type header and the <meta> tag to UTF-8. Some browsers don't respect one or the other...
header ('Content-Type: text/html; charset=UTF-8');
echo '<meta http-equiv="content-type" content="text/html; charset=utf-8" />';
As noted in the comments, that's not the problem...
3. You're doing something to the string before echoing it
Many of PHP's string functions operate on bytes and will not do well with UTF-8. If you're calling a normal function that doesn't accept a $charset parameter, chances are it won't handle UTF-8 strings correctly (substr and strlen, for example, count bytes rather than characters). If it does have a $charset parameter (like htmlspecialchars), make sure that you set it.
echo htmlspecialchars($content, ENT_COMPAT, 'UTF-8');
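To illustrate how a byte-oriented function can damage a UTF-8 string before it is echoed, here is a small sketch. It assumes the mbstring extension is available and reuses the title from the question as sample data.

```php
<?php
// "décor or ©" as UTF-8 bytes.
$title = "d\xC3\xA9cor or \xC2\xA9";

// substr() counts bytes, so cutting after 2 bytes splits the two-byte "é".
$byteCut = substr($title, 0, 2);

// mb_substr() counts characters when given the encoding.
$charCut = mb_substr($title, 0, 2, 'UTF-8');

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($charCut, 'UTF-8')); // bool(true)

// With an explicit charset, htmlspecialchars() escapes markup
// and leaves valid multibyte characters untouched.
$escaped = htmlspecialchars($title . ' <b>', ENT_COMPAT, 'UTF-8');
echo $escaped, "\n";
```

A string cut in the middle of a character, like $byteCut here, is the kind of input a browser renders with a replacement character.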
Upvotes: 2