jdsmith2816

Reputation: 325

php 5.2 + mysql 5.1 character encoding issue

Background: There is a table, events, whose default character set is latin1. Individual columns in this table are set to utf8. The column I will single out for discussion is 'title', which is one of the utf8 columns. The website is set to utf8 both via Apache and the meta tag.

As a test, if I save décor or © into the title field and perform

SELECT title, LENGTH(title) AS len, CHAR_LENGTH(title) AS chlen
FROM events WHERE LENGTH(title) != CHAR_LENGTH(title)

I get décor or ©, 12, 10 back as a result, which is expected: the byte length exceeds the character length, showing that the data has indeed been saved properly into my utf8 column.
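The same byte-versus-character check can be reproduced from PHP. This is only a minimal sketch: it assumes the mbstring extension is loaded, and it writes the test string as explicit UTF-8 bytes so the result does not depend on how the source file is saved.

```php
<?php
// "décor or ©" spelled out as UTF-8 bytes: é = C3 A9, © = C2 A9.
$title = "d\xC3\xA9cor or \xC2\xA9";

// strlen() counts bytes, comparable to MySQL's LENGTH().
$byteLength = strlen($title);

// mb_strlen() with an explicit charset counts characters,
// comparable to MySQL's CHAR_LENGTH().
$charLength = mb_strlen($title, 'UTF-8');

echo $byteLength, ", ", $charLength, "\n"; // prints "12, 10"
```

If the two numbers differ, the string holds multibyte sequences, which matches what the query reports for the stored value.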

However, upon echoing the title out to a page, it's mangled into d�cor or �, which makes no sense to me since, as mentioned before, the character encoding is set to utf-8 on the page.

Not sure if this final detail makes a difference but if I edit the page and resubmit the mangled text it turns into d%uFFFDcor or %uFFFD both in the database and when displayed to the page. Further submits cause no change.

Actual Question: Does anyone have an idea as to what I may be doing wrong? :-P

Upvotes: 1

Views: 975

Answers (1)

ircmaxell

Reputation: 165201

Well, it's likely one of three problems.

1. MySQL's connection is not using UTF-8

This means that the data is converted to another charset (likely Latin-1) before it hits PHP. I've found the best solution is to run the following queries right after connecting:

SET CHARACTER SET "utf8";
SET character_set_database = "utf8";
SET character_set_connection = "utf8";
SET character_set_server = "utf8";
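To see why a Latin-1 connection would produce exactly the symptom in the question, the conversion can be simulated in PHP without a database. This is a rough sketch, assuming mbstring is available; the call to mb_convert_encoding here only stands in for what the server would do on a latin1 connection.

```php
<?php
// The stored title as valid UTF-8 bytes (é = C3 A9, © = C2 A9).
$utf8 = "d\xC3\xA9cor or \xC2\xA9";

// Stand-in for the server converting results to latin1 for the client.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// In latin1, é and © are the single bytes E9 and A9.
$hex = bin2hex($latin1);

// Those lone bytes are not valid UTF-8, so a page declared as UTF-8
// shows a replacement character in their place.
$validBefore = mb_check_encoding($utf8, 'UTF-8');
$validAfter  = mb_check_encoding($latin1, 'UTF-8');

echo strlen($latin1), " bytes: ", $hex, "\n";
var_dump($validBefore, $validAfter);
```

From PHP itself, mysqli::set_charset('utf8') (or mysql_set_charset('utf8') on PHP 5.2.3 and later) sets the client, connection and results charsets in one call.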

2. The page rendered is not really set to UTF-8

Set both the Content-type header and the <meta> tag content types to UTF-8. Some browsers don't respect one or the other...

header ('Content-Type: text/html; charset=UTF-8');

echo '<meta http-equiv="content-type" content="text/html; charset=utf-8" />';

As noted in the comments, that's not the problem...

3. You're doing something to the string before echoing it

Many of PHP's string functions work on bytes rather than characters, so they can miscount or split multibyte UTF-8 sequences (strlen, substr and strtolower are typical examples; plain search-and-replace with str_replace is usually safe as long as both strings are UTF-8). If a function has a $charset parameter (like htmlspecialchars), make sure that you set it.

echo htmlspecialchars($content, ENT_COMPAT, 'UTF-8');
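As an illustration of how a byte-oriented function can damage a UTF-8 string before it is echoed, here is a small sketch comparing substr with mb_substr, followed by htmlspecialchars with an explicit charset. It assumes the mbstring extension is loaded; the variable names are only for illustration.

```php
<?php
// "décor" as UTF-8 bytes (é = C3 A9).
$title = "d\xC3\xA9cor";

// substr() cuts after 2 bytes, leaving half of the two-byte é behind.
$byteCut = substr($title, 0, 2);

// mb_substr() cuts after 2 characters and keeps é intact.
$charCut = mb_substr($title, 0, 2, 'UTF-8');

// With the charset stated explicitly, only the markup characters are
// escaped and the multibyte sequence passes through untouched.
$escaped = htmlspecialchars("<b>" . $title . "</b>", ENT_COMPAT, 'UTF-8');

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($charCut, 'UTF-8')); // bool(true)
echo $escaped, "\n";
```

A truncated sequence like the one in $byteCut is rendered by browsers as a replacement character, the same symptom a wrong connection charset produces.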

Upvotes: 2
