Doğan Uçar

Reputation: 186

PHP - badly encoded Turkish characters in MySQL database

I am working on a Turkish website that has many malformed Turkish characters stored in a MySQL database, for example:

 - ş as þ
 - ı as ý
 - ğ as ð
 - İ as Ý

I cannot change the data in the database, because the database is updated daily and the new data will contain the malformed characters again. So my idea was to fix the data in PHP instead of in the database. I have tried the steps from these questions:

 - Turkish characters are not displayed correctly
 - Fix Turkish Charset Issue Html / PHP (iconv?)
 - PHP Turkish Language displaying issue
 - PHP MYSQL encoding issue ( Turkish Characters )

I am using the PHP-MySQLi-Database-Class available on GitHub with utf8 as charset.

I have even tried to replace the malformed characters with str_replace, like:

$newString = str_replace ( chr ( 253 ), "ı", $newString );

My question is: how can I solve the issue without changing the characters in the database? Are there any best practices? Is it a good option to just replace the characters?

EDIT: I solved it by using

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-9" />

Upvotes: 3

Views: 4528

Answers (3)

AMN

Reputation: 21

2022 update. After a lot of research I found this solution, and it works. Let's say your database connection is $mysqli:

$mysqli = mysqli_connect($hostname, $username, $password, $database) OR DIE ("Baglanti saglanamadi!");

Just add this line after it. It works like magic with all languages, even Arabic:

mysqli_set_charset($mysqli, 'utf8');

Upvotes: 2

Rick James

Reputation: 142218

SELECT CONVERT(CONVERT(UNHEX('d0dddef0fdfe') USING ...) USING utf8);

latin5 / iso-8859-9 shows ĞİŞğış
latin1 / iso-8859-1 shows ÐÝÞðýþ
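
The same comparison can be reproduced from PHP by decoding the six bytes under each charset. This is only a minimal sketch, assuming the mbstring extension is available and the script file is saved as UTF-8; the variable names are illustrative:

```php
<?php
// The six bytes from the SELECT above.
$bytes = hex2bin('d0dddef0fdfe');

// Decode the same bytes under each single-byte charset into UTF-8.
$asLatin5 = mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-9'); // Turkish
$asLatin1 = mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1'); // Western European

echo $asLatin5, "\n"; // ĞİŞğış
echo $asLatin1, "\n"; // ÐÝÞðýþ
```

The bytes are identical in both cases; only the charset used to interpret them differs.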

You are confusing two similar encodings; see the first paragraph in https://en.wikipedia.org/wiki/ISO/IEC_8859-9 .

"Collation" is only for sorting. But first you need to change the CHARACTER SET to latin5. Then change the collation to latin5_turkish_ci. (Since that is the default for latin5, no action need be taken.)

This may suffice to make the change in MySQL: EDIT 3

NO, this is probably wrong -- ALTER TABLE tbl CONVERT TO CHARACTER SET latin5;

After seeing more of the issue, this "2-step ALTER" is probably correct:

ALTER TABLE Tbl MODIFY COLUMN col VARBINARY(...) ...;
ALTER TABLE Tbl MODIFY COLUMN col VARCHAR(...) ... CHARACTER SET latin5 ...;

Do that for each table. Be sure to test this on a copy of your data first.

The 2-step ALTER is useful for when the bytes are correct, but the CHARACTER SET is not.

CONVERT TO should be used when the characters are correct, but you want a different encoding (and CHARACTER SET). See Case 5.

Edit 1

E7 and FD are cp1250, dec8, latin1 and latin2 for ç and ý. FD in latin5 is ı. I conclude that your encoding is latin1, not latin5.

You say you cannot change the "scripts". Let's look at your limitations. Are you restricted on the INSERT side? Or the SELECT side? Or both? What is rendering the text; HTML? MySQL is willing to convert between latin1 and latin5 as you insert/select (based on a few settings). And/or you could lie to HTML (via a meta tag) to get it to interpret the bytes differently. Please spell out the details of the data flow.

Edit 2

Given that the HEX in the table is E7FD6B6172FD6C6D6173FD6E61, and it should be rendered as çıkarılmasına, ... Note especially the second letter needs to show as ı (Turkish dotless small I), not ý (small Y with acute), correct?

Start by trying

<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-9"/>

That should give you the latin5 rendering, as you already found out. IANA Reference.

As for "Best practice", that would involve changing the way text is inserted. You have stated this as off-limits.

Apparently you have latin5 characters stored in a latin1 column. Since latin1 does not involve any checking, you can insert and retrieve latin5 characters without any trouble.
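
If lying to HTML is not an option and the page must be served as UTF-8, the reinterpretation could instead be done in PHP after fetching. The following is a rough sketch, not a tested fix for your setup: both function names are hypothetical, it assumes the mbstring extension, and it assumes the script file is saved as UTF-8. Which helper applies depends on the connection charset.

```php
<?php
// Hypothetical helper: reinterpret raw column bytes as ISO-8859-9 (latin5)
// and return UTF-8. Assumes the connection charset is latin1, so MySQL
// passes the stored bytes through unchanged.
function latin5BytesToUtf8(string $raw): string
{
    return mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-9');
}

// Hypothetical helper for a utf8 connection: MySQL has already decoded the
// bytes as latin1, so undo that step first, then decode as latin5.
// Only safe for characters that exist in ISO-8859-1.
function misdecodedUtf8ToTurkish(string $utf8): string
{
    $raw = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
    return latin5BytesToUtf8($raw);
}

$stored = hex2bin('E7FD6B6172FD6C6D6173FD6E61'); // HEX value quoted above
echo latin5BytesToUtf8($stored), "\n";               // çıkarılmasına
echo misdecodedUtf8ToTurkish('çýkarýlmasýna'), "\n"; // çıkarılmasına
```

Unlike a per-character str_replace, this covers every character where the two charsets differ in one step, but it has to be applied to every text column that is read.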

This does not address the desire to have Turkish collation. If necessary, I can probably concoct a way to specify Turkish ordering on particular statements; please provide a sample statement.

Upvotes: 0

Fahed Alkaabi

Reputation: 269

These two solutions are good:

PHP MYSQL encoding issue ( Turkish Characters )

PHP Turkish Language displaying issue

You can also set the collation in phpMyAdmin:

Operations > Table options > Collation > select utf8_general_ci

If the tables have already been created, edit the collation in their structure as well.

Upvotes: 0
