Patrick

Reputation: 3367

Convert UTF-8 to ANSI (Windows-1252)

I'm trying to save a Hebrew string to a file that is ANSI-encoded. All attempts have failed, I'm afraid.

  1. The PHP file itself is UTF-8.

Here's the code I'm trying:

$to_file = "בדיקה אם נרשם";  
$to_file = mb_convert_encoding($to_file, "WINDOWS-1255", "UTF-8");  
file_put_contents(dirname(__FILE__) ."/txt/TESTING.txt",$to_file);      

This returns false for some reason.

Another attempt was:

$to_file = iconv("UTF-8", "windows-1252", $to_file);

This returns an empty string. While this did not work, changing the output charset to windows-1255 did work. So the function itself works, but for some reason it does not convert to 1252.

I ran this function before and after the iconv call and printed the results:

mb_detect_encoding ($to_file);

Before the iconv call, the encoding is UTF-8.
After the iconv call, the encoding is ASCII (??)

I'd really appreciate any help you can give.

Upvotes: 3

Views: 25161

Answers (2)

Shlomtzion

Reputation: 718

You can use this:

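<?php
// Assumes $heb holds Windows-1255 bytes (e.g. this file is saved as Windows-1255).
$heb = 'טקסט בעברית .. # ';
// Map each Windows-1255 Hebrew letter (0xE0-0xFA) to its two-byte UTF-8 form (0xD7 0x90-0xAA).
// preg_replace_callback is used because the /e modifier was removed in PHP 7.
$utf = preg_replace_callback('/[\xE0-\xFA]/', function ($m) {
    return chr(215) . chr(ord($m[0]) - 80);
}, $heb);
echo '<pre>';
print_r($heb);
echo '</pre>';
echo '------';
echo '<pre>';
print_r($utf);
echo '</pre>';
?>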

Output will be like this:

���� ������ .. # <-- $heb: what we get when we print Hebrew encoded as ANSI Windows-1255

טקסט בעברית .. # <-- $utf: the Windows-1255 text converted to UTF-8 :)

Upvotes: 0

deceze

Reputation: 522076

Windows-1252 is a Latin encoding; you cannot encode Hebrew characters in Windows-1252. That's why it doesn't work.
Windows-1255 is an encoding for Hebrew; that's why it works.
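
As a minimal sketch of the working route, assuming an iconv build that knows the name WINDOWS-1255 (as the question reports), the conversion can be wrapped so that a failed conversion is noticed instead of silently writing nothing. The helper name utf8_to_windows1255 is made up for illustration and is not part of PHP:

```php
<?php
// Hypothetical helper (not a PHP built-in): convert UTF-8 text to Windows-1255
// and fail loudly when a character has no mapping in the target charset.
function utf8_to_windows1255(string $utf8): string
{
    // iconv() returns false (and raises a notice) on unmappable input;
    // the @ only silences the notice, the false is still checked below.
    $converted = @iconv('UTF-8', 'WINDOWS-1255', $utf8);
    if ($converted === false) {
        throw new RuntimeException('Input contains characters that Windows-1255 cannot represent.');
    }
    return $converted;
}

$to_file = utf8_to_windows1255("בדיקה אם נרשם");
// Every Hebrew letter becomes exactly one byte, so the byte count
// equals the character count (11 letters + 2 spaces).
echo strlen($to_file), "\n";
```

The returned string can then be passed to file_put_contents exactly as in the question.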

The reason it doesn't work with mb_convert_encoding is that the mbstring extension (the mb_* functions) doesn't support Windows-1255.
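
If you want to check this on your own installation, something like the following should do. It is only a sketch: mbstring_supports is a made-up helper name, it compares canonical names only (aliases are ignored), and the result depends on your PHP version and build:

```php
<?php
// Hypothetical helper (not a PHP built-in): case-insensitive lookup of an
// encoding name in the list that this build's mbstring reports.
function mbstring_supports(string $encoding): bool
{
    foreach (mb_list_encodings() as $name) {
        if (strcasecmp($name, $encoding) === 0) {
            return true;
        }
    }
    return false;
}

// Print the answer for the two encodings discussed here.
var_dump(mbstring_supports('Windows-1252'));
var_dump(mbstring_supports('Windows-1255'));
```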

Detecting encodings reliably is by definition impossible. Windows-1255 is a single-byte encoding, and it's virtually impossible to distinguish any one single-byte encoding from another. The result is just as valid in ASCII as it is in Windows-1255, Windows-1252, ISO-8859 or any other single-byte encoding.
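
To illustrate the ambiguity, here is a small sketch (again assuming an iconv build that knows both charset names). The same byte decodes without error under two different single-byte encodings, and only the interpretation differs:

```php
<?php
// One byte, two equally valid readings.
$byte = "\xE0";

// Read as Windows-1255 it is the Hebrew letter alef (U+05D0).
$as1255 = iconv('WINDOWS-1255', 'UTF-8', $byte);
// Read as Windows-1252 it is a Latin "a" with grave accent (U+00E0).
$as1252 = iconv('WINDOWS-1252', 'UTF-8', $byte);

echo $as1255, "\n", $as1252, "\n";
```

Nothing in the byte itself says which reading was intended; that information has to come from outside the data.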

See What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text for more information.

Upvotes: 5
