Reputation: 66123
I have been stumped by an issue, and most of the tricks I have tried simply do not work. An overview of the problem is as follows:
Create table, collation set to utf8_unicode_ci. Same for columns.
Page where the form is located has a character encoding of UTF-8 (in the <meta> tag).
Form is set to accept a character set of UTF-8 (<form action="execute.php" method="POST" accept-charset="utf-8">).
Execute.php sanitizes form input using htmlspecialchars(@trim($str), ENT_QUOTES, "UTF-8"); and also runs mysql_real_escape_string($str);
Declare that the database connection should be encoded in UTF-8 (mysql_set_charset('utf-8');).
Insert values into db. If I halt the database insert and echo the query, I get normal looking output.
Now the fun begins. MySQL rows display odd characters, e.g. ß turning into ÃŸ.
If I retrieve the database data and present it on a page with UTF-8 encoding, the characters look jumbled (ÃŸ), too. However, when I change the page encoding to Western ISO, the characters display just fine (ß).
I suspect that there is a problem when the form submits the data to the database, but I can't pinpoint exactly where it goes wrong.
Upvotes: 0
Views: 341
Reputation: 71384
You need the character set of the database table and columns set to UTF-8 as well. Collation only deals with how data will be sorted and compared, not how it is encoded.
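To see why the encoding is what matters here, it can help to reproduce the symptom outside the database. The sketch below assumes the mbstring extension is loaded and that the bytes are being misread as Windows-1252 (roughly what MySQL calls latin1) somewhere along the way. That is a guess about the cause, not something the question confirms, and the variable names are only for illustration.

```php
<?php
// "ß" encoded as UTF-8 is the two bytes C3 9F.
$utf8 = "\xC3\x9F";

// Assumption: something in the chain treats those bytes as Windows-1252
// text and converts them to UTF-8 a second time.
$garbled = mb_convert_encoding($utf8, 'UTF-8', 'Windows-1252');

echo bin2hex($utf8), "\n";     // c39f
echo bin2hex($garbled), "\n";  // c383c5b8, which a UTF-8 page renders as "ÃŸ"

// Reversing the conversion recovers the original bytes.
$repaired = mb_convert_encoding($garbled, 'Windows-1252', 'UTF-8');
var_dump($repaired === $utf8); // bool(true)
```

If the stored bytes for ß look like c383c5b8 (for example via SELECT HEX(column)), the data was probably converted twice on the way in, which points at the connection charset rather than the collation.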
Upvotes: 0
Reputation: 15374
A few things:
Don't use htmlspecialchars or any other sanitization on input. Validate input, but store it as-is if it's valid. When you output it, use htmlspecialchars.
Run SET NAMES utf8 after you have initialized your database connection with mysqli.
Send header('Content-type: text/html; charset=utf-8'); You can do this from anywhere in your code if you're using output buffering.
Add <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> to the page <head>.
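The points above could be sketched roughly as follows. This is a minimal sketch, not a drop-in fix: the function names e, open_utf8_connection and store_value, the table name entries and the column name body are all hypothetical. It uses mysqli's set_charset() in place of a literal SET NAMES query, and a prepared statement in place of mysql_real_escape_string. Note that MySQL's charset name is written without a hyphen (utf8, not utf-8), which may be worth checking in the question's mysql_set_charset('utf-8') call.

```php
<?php
// Hypothetical helper: escape for HTML at output time only,
// so the database keeps the text exactly as the user typed it.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: open a mysqli connection and declare its charset.
function open_utf8_connection(string $host, string $user, string $pass, string $db): mysqli
{
    $conn = new mysqli($host, $user, $pass, $db);
    // MySQL expects 'utf8' here; 'utf-8' is not a valid charset name.
    if (!$conn->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $conn->error);
    }
    return $conn;
}

// Hypothetical helper: store a validated value as-is.
// The table "entries" and column "body" are assumptions.
function store_value(mysqli $conn, string $text): void
{
    $stmt = $conn->prepare('INSERT INTO entries (body) VALUES (?)');
    $stmt->bind_param('s', $text);
    $stmt->execute();
    $stmt->close();
}
```

On the page that displays the data, you would call header('Content-Type: text/html; charset=utf-8'); before any output and then echo e($row['body']); for each value, where $row is whatever your fetch call returns.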
Upvotes: 3