Reputation: 4098
I have a database that is nominally UTF-8 encoded but contains a mixture of UTF-8 and Latin-1 data. (I think that is the problem.)
This is how the characters look in the database.
Ä° (should be İ)
è
When I set the header to
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
Then the characters come out as:
İ
�
When I remove the header, they come out as they are in the database. I want them to come out like this:
İ
è
I'm looking for a way to remedy this in PHP after the fact, if it is possible. I am unable to correct the data itself at this time, which would be the correct thing to do.
Upvotes: 5
Views: 39698
Reputation: 2528
I know this is an old post, but in case someone comes across this issue, here is what I did to solve the problem.
1) export table(s) to sql
2) open sql with notepad++ or other editor
3) copy everything and paste it into a new file with a BOM (or use Notepad and save as Unicode)
4) my exported file contains this:
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES latin1 */;
in which I changed SET NAMES from latin1 to utf8:
/*!40101 SET NAMES utf8 */;
If your file doesn't have this line, simply add it. Then, in
CREATE TABLE IF NOT EXISTS `table_name` (
// column names....
) ENGINE=MyISAM AUTO_INCREMENT=301 DEFAULT CHARSET=latin1;
change
DEFAULT CHARSET=latin1;
to
DEFAULT CHARSET=utf8;
Delete the old tables (back them up first, of course) and import the new file.
It worked for me. Hope that helps.
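The two text substitutions in the steps above can also be scripted rather than done by hand in an editor. This is only a minimal sketch: the function name retarget_dump_charset is made up for illustration, and it rewrites the charset declarations only. It does not re-encode the dump, so the file still has to be saved as UTF-8 as described in step 3.

```php
<?php
// Hypothetical helper (not part of the original answer): rewrites the
// charset declarations found in a MySQL dump from latin1 to utf8.
// It does NOT change the encoding of the dump file itself.
function retarget_dump_charset(string $sql): string
{
    // "SET NAMES latin1" -> "SET NAMES utf8" (case-insensitive)
    $sql = preg_replace('/\bSET\s+NAMES\s+latin1\b/i', 'SET NAMES utf8', $sql);

    // "DEFAULT CHARSET=latin1" -> "DEFAULT CHARSET=utf8"
    $sql = preg_replace('/\bDEFAULT\s+CHARSET\s*=\s*latin1\b/i', 'DEFAULT CHARSET=utf8', $sql);

    return $sql;
}
```

It could be applied to a whole file by passing the result of file_get_contents() through it and writing the output with file_put_contents(); keep the original dump as a backup either way.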
Upvotes: 1
Reputation: 3367
Maybe you should choose utf8 as the connection character set, so that the characters are retrieved correctly. The default connection character set might not be right for the characters you need.
More details here: mysql_set_charset
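As a rough illustration, here is how the connection character set might be set with the mysqli extension, which provides set_charset() as the counterpart of mysql_set_charset (the old mysql_* functions were removed in PHP 7). The function name open_utf8_connection and its parameters are placeholders, not something from the question.

```php
<?php
// Sketch only: open a mysqli connection and request UTF-8 for the
// client/server communication. All connection details are placeholders.
function open_utf8_connection(string $host, string $user, string $pass, string $db): mysqli
{
    $link = new mysqli($host, $user, $pass, $db);
    if ($link->connect_errno) {
        throw new RuntimeException('Connect failed: ' . $link->connect_error);
    }

    // Sets the connection character set, like "SET NAMES utf8", and also
    // informs the client library so that escaping uses the same charset.
    if (!$link->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $link->error);
    }

    return $link;
}
```

This only controls how the server converts data on the way out. If some rows were stored with the wrong encoding in the first place, they will probably still come out wrong.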
Upvotes: 2
Reputation: 99
You have to coordinate three things in this case.
1) The character encoding of the DB table's content hardly matters, because in MySQL you can set the character encoding used for communication between the DB server and your PHP script. See http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html If you use SET NAMES / SET CHARACTER SET correctly, the connection will deliver UTF-8 characters regardless.
2) You need to check the "physical" (byte-level) character encoding of your PHP script file. Set it to UTF-8 in whichever text editor / IDE you use.
3) You need to use the appropriate HTML header; you wrote it correctly above.
If all things match properly, the result should be alright.
The only possible trouble is when the text in the DB table was stored with an incorrect character encoding.
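One way to find out whether a stored value falls into that last case is to look at its raw bytes. The following is a small diagnostic sketch, assuming the mbstring extension is available; the name inspect_bytes is invented for this example.

```php
<?php
// Hypothetical diagnostic: show the raw bytes of a value and whether
// those bytes form a valid UTF-8 sequence.
function inspect_bytes(string $value): array
{
    return [
        'hex'        => bin2hex($value),                    // raw bytes as hex
        'valid_utf8' => mb_check_encoding($value, 'UTF-8'), // true if well-formed UTF-8
    ];
}
```

For instance, İ stored as UTF-8 is the two bytes c4 b0 and passes the check, while è stored as Latin-1 is the single byte e8, which is not valid UTF-8 on its own.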
Upvotes: 1
Reputation: 437336
Your HTML output needs to be in a single encoding; there is no way around that. This means that content in different encodings needs to be converted to your HTML encoding first. While that is possible to do with iconv or mb_convert_encoding, there are two problems you have to solve:
For example, a theoretical solution would be to pick UTF-8 as your HTML encoding and then do this for all strings you are going to output:
$string = '...'; // from the database
// If it's not already UTF-8, convert to it
if (mb_detect_encoding($string, 'utf-8', true) === false) {
    $string = mb_convert_encoding($string, 'utf-8', 'iso-8859-1');
}
echo $string;
The code above assumes that non-UTF-8 content is encoded in latin-1, which is reasonable according to your question.
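The same idea could be wrapped in a reusable helper and applied to every string column of a fetched row. This is a sketch under the same Latin-1 assumption; the names to_utf8 and row_to_utf8 are made up here, and mb_check_encoding is swapped in for the mb_detect_encoding test as a more direct validity check.

```php
<?php
// Hypothetical helper: return $value as UTF-8, assuming anything that is
// not already valid UTF-8 is ISO-8859-1 (Latin-1).
function to_utf8(string $value): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already well-formed UTF-8, leave untouched
    }
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

// Hypothetical helper: apply to_utf8() to every string field of a row,
// leaving non-string values (integers, NULLs) as they are.
function row_to_utf8(array $row): array
{
    return array_map(function ($field) {
        return is_string($field) ? to_utf8($field) : $field;
    }, $row);
}
```

Note that this works per string: a single value that mixes both encodings, or Latin-1 text whose bytes happen to form valid UTF-8, would not be handled correctly.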
Upvotes: 17