Reputation: 186
So after a whole day of googling and debugging I end up here.
My database is set to the following encoding:
db: utf8_general_ci
table: utf8_general_ci
column: utf8_general_ci, TEXT
I put in some euro symbols and some other weird characters:
acentuação €€€€€
config
$config['charset'] = 'UTF-8';
dsn
char_set=utf8,dbcollat=utf8_general_ci
I made some queries to compare
$query = $this->db->query("SET NAMES latin1");
$query = $this->db->query("SELECT shortdesc,HEX(shortdesc) FROM `contracttypes` WHERE id = 4");
$ret['latin1'] = $query->row();
$query = $this->db->query("SET NAMES utf8");
$query = $this->db->query("SELECT shortdesc,HEX(shortdesc) FROM `contracttypes` WHERE id = 4");
$ret['utf8'] = $query->row();
return $ret;
public function utfhell() {
var_dump($this->campagne_model->utfhell());
}
This outputs
array (size=2)
'latin1' =>
object(stdClass)[34]
public 'shortdesc' => string 'acentua��o �����' (length=16)
public 'HEX(shortdesc)' => string '6163656E747561C3A7C3A36F20E282ACE282ACE282ACE282ACE282AC' (length=56)
'utf8' =>
object(stdClass)[33]
public 'shortdesc' => string 'acentuação €€€€€' (length=28)
public 'HEX(shortdesc)' => string '6163656E747561C3A7C3A36F20E282ACE282ACE282ACE282ACE282AC' (length=56)
So far so good, on to a view:
<?php header('Content-Type: text/html; charset="utf-8"', true); ?>
<!doctype html>
<html>
<head>
<title>UTFhell</title>
<link rel="stylesheet" href="../assets/css/style.css"/>
<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
...
<?php
echo 'Original : ', $campagne_info->contractName->shortdesc."<br />";
echo 'UTF8 Encode : ', utf8_encode($campagne_info->contractName->shortdesc)."<br />";
echo 'UTF8 Decode : ', utf8_decode($campagne_info->contractName->shortdesc)."<br />";
echo 'TRANSLIT : ', iconv("ISO-8859-1", "UTF-8//TRANSLIT", $campagne_info->contractName->shortdesc)."<br />";
echo 'IGNORE TRANSLIT : ', iconv("ISO-8859-1", "UTF-8//IGNORE//TRANSLIT", $campagne_info->contractName->shortdesc)."<br />";
echo 'IGNORE : ', iconv("ISO-8859-1", "UTF-8//IGNORE", $campagne_info->contractName->shortdesc)."<br />";
echo 'Plain: ', iconv("ISO-8859-1", "UTF-8", $campagne_info->contractName->shortdesc)."<br />";
echo '€€€€€€€€€€<br>';
?>
None of these show me a normal euro symbol except the final echo statement; all the others give me question-mark diamonds (�) in place of the euro symbols.
Upvotes: 2
Views: 3461
Reputation: 142560
The HEX is the utf8 encoding for that string. So the data is in the table 'correctly'.
The black diamond (�) is the browser's way of saying wtf. It comes from having latin1 characters, but telling the browser to display utf8 characters.
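To make that concrete, here is a small PHP sketch (it assumes the mbstring extension is loaded; the variable names are mine, not from the question). It decodes the HEX value from the question, checks that it is well-formed UTF-8, then converts it to Windows-1252, which is approximately what MySQL calls latin1, and checks those bytes again. Bytes that fail the UTF-8 check are the kind a browser renders as � when the page is declared as UTF-8.

```php
<?php
// Hex string copied from the HEX(shortdesc) output in the question.
$hex  = '6163656E747561C3A7C3A36F20E282ACE282ACE282ACE282ACE282AC';
$utf8 = hex2bin($hex);

// The stored bytes are well-formed UTF-8: 28 bytes, 16 characters.
$isUtf8     = mb_check_encoding($utf8, 'UTF-8');
$byteLength = strlen($utf8);
$charLength = mb_strlen($utf8, 'UTF-8');

// Approximate what the client receives after "SET NAMES latin1":
// one byte per character in Windows-1252 (an assumption standing in for MySQL's latin1).
$latin1        = mb_convert_encoding($utf8, 'Windows-1252', 'UTF-8');
$latin1IsUtf8  = mb_check_encoding($latin1, 'UTF-8');
$latin1Length  = strlen($latin1);

var_dump($isUtf8, $byteLength, $charLength, $latin1IsUtf8, $latin1Length);
```

The two lengths line up with the `length=28` and `length=16` values in the question's var_dump, which suggests the view is being handed the single-byte version of the string.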
You could tell the browser to display "Western", but that only avoids the underlying problem. Remember, the goal is to really use utf8.
Sometimes this occurs together with Question Marks, in which case you must start over.
The cause (probably):
Solution, Plan A: (Sloppy, but probably workable)
Change #5 to say the appropriate equivalent of latin1.
Solution, Plan B:
query("SET NAMES utf8") (unless there is a way to set it at connect time)
CHARACTER SET utf8
<meta ... UTF-*>
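For the "set it at connect time" part, a sketch assuming the project uses CodeIgniter's array-style database config (the question's `$this->db` calls and `char_set`/`dbcollat` DSN parameters suggest it does). The file path and the `default` group name are assumptions about the project layout; only the charset-related keys are shown.

```php
<?php
// application/config/database.php (assumed location and group name).
// These are the same settings the question passes in its DSN string.
$db['default']['char_set'] = 'utf8';
$db['default']['dbcollat'] = 'utf8_general_ci';
```

With these in place, any explicit "SET NAMES latin1" left over from debugging should be removed, since it overrides the connection charset for every query that follows on that connection.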
Upvotes: 1