Reputation: 1
I have a UCS-2 text file, and I want to read it as a UTF-8 string. This is the code I used:
my_code.php:
<?php
error_reporting(0);
header('Content-Type: text/html; charset=utf-8');
echo '<form enctype="multipart/form-data" method="post"><p><input type="file" name="my_file" /> <input type="submit" value="+" /><hr />';
$my_str = file_get_contents(($_FILES['my_file']['tmp_name']));
echo $my_str;
?>
viet_test.txt:
"Vietnamese" is "Tiếng Việt".
But the output is wrong: ��"Vietnamese" is "Ti�ng Vi�t".
This is what I am looking for: "Vietnamese" is "Tiếng Việt" (in UTF-8).
What is wrong with my code, and how can I fix it?
Sorry, I am not very experienced with PHP.
Upvotes: 0
Views: 3156
Reputation: 522091
You cannot read the file "as UTF-8". It contains UCS-2, so whatever you read from it is a UCS-2 string. You can, however, convert that UCS-2 string to UTF-8:
$my_str = file_get_contents($_FILES['my_file']['tmp_name']);
$my_str = mb_convert_encoding($my_str, 'UTF-8', 'UCS-2');
echo $my_str;
Note that you might have to use UCS-2BE or UCS-2LE explicitly.
If that still returns "nothing", you have a different problem than anything to do with encodings.
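On choosing between UCS-2BE and UCS-2LE: the two stray characters at the start of the output in the question are plausibly a byte order mark (BOM), which, if present, tells you which variant the file uses. The following is only a sketch under that assumption. The helper name ucs2_to_utf8 and its fallback parameter are hypothetical, not part of PHP, and the mbstring extension must be enabled.

```php
<?php
// Hypothetical helper (not a built-in): choose the byte order from a
// leading BOM, strip the BOM, then convert the rest to UTF-8.
function ucs2_to_utf8(string $raw, string $fallback = 'UCS-2LE'): string
{
    $bom = substr($raw, 0, 2);
    if ($bom === "\xFF\xFE") {
        $encoding = 'UCS-2LE';        // BOM bytes FF FE: little-endian
        $raw = (string) substr($raw, 2);
    } elseif ($bom === "\xFE\xFF") {
        $encoding = 'UCS-2BE';        // BOM bytes FE FF: big-endian
        $raw = (string) substr($raw, 2);
    } else {
        $encoding = $fallback;        // assumption: no BOM, so guess little-endian
    }
    return mb_convert_encoding($raw, 'UTF-8', $encoding);
}

// Small self-contained check: "Tiếng" as UCS-2LE bytes, preceded by a BOM.
$sample = "\xFF\xFE" . "T\x00i\x00\xBF\x1E" . "n\x00g\x00";
echo ucs2_to_utf8($sample), "\n";
```

In the upload script from the question, the call would look like ucs2_to_utf8(file_get_contents($_FILES['my_file']['tmp_name'])). If the file has no BOM, the fallback is a guess, so pass 'UCS-2BE' as the second argument when the result comes out garbled.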
Upvotes: 1