user2287202
user2287202

Reputation: 71

Different Characters returned via remote database connection utf8 than local connection

I am trying to retrieve foreign language UTF8 data via a remote mysql database connection. When I retrieve the data remotely, the utf8 doesn't appear properly in the browser. However, when I retrieve the data via a local database connection, both on the live site, and on the local testing machine, the characters appear correctly in the browser.

My remote connection is from wamp local server to the online live website.

For every page I have set:

header('Content-Type: text/html; charset=utf-8');

I've also tried to set UTF-8 meta tag. I also have UTF8 specified in .htaccess as the default charset.

It's an older website so am still using mysqli. I have also tried setting:

$mysqli->set_charset("utf8");

For example, with remote connection Français is appearing as Français.

I have no idea what to do with this. I have spent hours trying to figure it out, but to no avail. I know it's the norm to ask for code, but there is just so much code, that I can't include it all here.

Thanks!

Upvotes: 0

Views: 295

Answers (2)

user2287202
user2287202

Reputation: 71

I made it work by adding the following to the script that calls the remote database:

$mysqli->set_charset("latin1");

I don't know if it's a bit of a hack, because it still means the chars are probably not encoded or collated correctly, but it works. Thanks Wrikken for showing me the character set modifications, I can try to use those here in the future to correct things properly.

Upvotes: 0

Wrikken
Wrikken

Reputation: 70470

And your solution is: on the remote database, the data is encoded to utf8 twice, which yields incorrect results. There is no problem in your code, that database is at fault. You can fix it there (if it's a varchar, make a backup first!): convert it to latin1 first, then to binary then to utf8. An working sql fiddle to show you how is here, I'll paste the code here too in case sqlfiddle removes it somewhere in the future:

-- database column correctly defined as utf8
CREATE TABLE base (col VARCHAR(128) CHARSET utf8);
-- wrong data is entered:
INSERT INTO base SELECT UNHEX('4672616EC383C2A7616973');
-- first, convert back to latin-1, we have now proper utf-8 data, but in a latin1 column
ALTER TABLE base MODIFY COLUMN col VARCHAR(128) CHARSET latin1;
-- convert to binary first, so MySQL leaves the bytes as is without conversion
ALTER TABLE base MODIFY COLUMN col VARBINARY(128);
-- then convert to the proper character set, which will leave the bytes once again intact
ALTER TABLE base MODIFY COLUMN col VARCHAR(128) CHARSET utf8;

Upvotes: 1

Related Questions