Reputation: 1580
Let me explain the situation in more detail.
<!DOCTYPE html>
<html lang="de">
<head>
<meta charset="utf-8">
I have read some theory about character encoding, its history, and the almighty UTF-8 that supposedly solves all your problems, which is simply not true. What could be wrong?
Upvotes: 2
Views: 4425
Reputation: 1580
Well, I found a solution:
function decode($string){
    // Percent-encode every byte so the raw values can be matched below.
    $string = urlencode($string);
    // urlencode() turns spaces into '+'; restore them first so that a literal
    // plus sign (%2B), decoded further down, is not turned into a space too.
    $string = str_replace('+',' ',$string);
    $string = str_replace('%DF','ß',$string);
    $string = str_replace('%E4','ä',$string);
    $string = str_replace('%F6','ö',$string);
    $string = str_replace('%2B','+',$string);
    $string = str_replace('%FC','ü',$string);
    $string = str_replace('%26','&',$string);
    $string = str_replace('%2F','/',$string);
    $string = str_replace('%0A','',$string); // drop line feeds
    $string = str_replace('%0D','',$string); // drop carriage returns
    $string = str_replace('%40','@',$string);
    $string = str_replace('%2C',',',$string);
    $string = str_replace('%E1','á',$string);
    $string = str_replace('%D3','ó',$string);
    return $string;
}
But isn't there a better solution?
Upvotes: 1
Reputation: 70853
First, identify exactly which byte values your broken characters have. Without knowing that, you cannot identify the encoding that is likely being used.
echo urlencode($string_with_umlauts);
This will print all non-ASCII characters (along with some ASCII punctuation) as percent-encoded hex values. Note that this function is meant for a different purpose, but it helps in this case as well.
Then look up the bytes in encoding tables, such as those on Wikipedia, and make sure you know which encoding you actually have.
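To make the lookup step concrete, here is a small sketch of what the output looks like for the same character in two encodings. The variable names are only illustrative, and the bytes are written as explicit escapes so the result does not depend on how the script file itself is saved. bin2hex() is shown as an alternative that dumps every byte, not only the ones urlencode() escapes.

```php
<?php
// The same character "ü" in two encodings, written as explicit bytes.
$latin1 = "\xFC";      // "ü" in ISO-8859-1 / Windows-1252 (one byte)
$utf8   = "\xC3\xBC";  // "ü" in UTF-8 (two bytes)

echo urlencode($latin1), "\n"; // %FC
echo urlencode($utf8), "\n";   // %C3%BC

// bin2hex() shows all bytes, including the ASCII ones urlencode() leaves alone.
echo bin2hex("M\xFCller"), "\n"; // 4dfc6c6c6572
```

A single byte such as %FC or %DF for an umlaut points to a single-byte encoding like ISO-8859-1, while a pair starting with %C3 points to UTF-8.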
The last step: Add a transformation layer to your database access logic that converts from the encoding you saw to UTF-8 with iconv functions.
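As a rough sketch of such a transformation layer: the helper name toUtf8() is made up for illustration, and ISO-8859-1 as the source encoding is an assumption (it would match byte values like %DF for ß and %FC for ü). Replace it with whatever encoding the lookup actually showed.

```php
<?php
// Hypothetical helper: converts a value read from the database to UTF-8.
// The default source encoding is an assumption, not a verified fact.
function toUtf8(string $value, string $sourceEncoding = 'ISO-8859-1'): string
{
    $converted = iconv($sourceEncoding, 'UTF-8', $value);
    if ($converted === false) {
        // iconv() returns false when the input is invalid for the source encoding.
        throw new RuntimeException("Cannot convert from $sourceEncoding to UTF-8");
    }
    return $converted;
}

$raw = "Stra\xDFe";                  // "Straße" as ISO-8859-1 bytes
echo urlencode($raw), "\n";          // Stra%DFe
echo bin2hex(toUtf8($raw)), "\n";    // 53747261c39f65
```

Calling this in one place, right where rows leave the database layer, avoids scattering per-character replacements through the rest of the code.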
Upvotes: 1