karto

Reputation: 3658

PHP: download an XML page and convert it to UTF-8

When I right-click the XML page in the browser, choose Save As, and open the file in Notepad++, the non-English characters display correctly. However, when I write a script to save the page to my server, I have issues with the character encoding. This is a real headache. Any help? Thanks.

function download_page($path)
{
    //$path = htmlentities($path);
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $path);
    curl_setopt($ch, CURLOPT_FAILONERROR, 1);
    //curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 280);
    $retValue = curl_exec($ch);
    if (!$retValue) {
        //echo "erro curl";
    }

    @curl_close($ch);
    return $retValue;
}

$file = download_page($url);
$file = mb_convert_encoding($file, 'HTML-ENTITIES', "UTF-8");
$file = utf8_encode($file);

Upvotes: 1

Views: 733

Answers (1)

ttamas

Reputation: 192

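Your code assumes that the downloaded result is encoded in UTF-8. First, are you sure that is true? And why do you need to convert it twice (first to 'HTML-ENTITIES', then back to UTF-8)? Note that utf8_encode() treats its input as ISO-8859-1, so it only makes sense when the data really is in that encoding. If you just want HTML entities, use the htmlentities() function.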
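As a rough illustration of converting only once, here is a minimal sketch. The helper name to_utf8() and the ISO-8859-1 default are assumptions made for the example, not something taken from the question; replace the default with whatever charset the XML page actually declares. It leaves the bytes alone when they are already valid UTF-8 and otherwise converts them a single time with mb_convert_encoding().

```php
<?php
/**
 * Hypothetical helper (not part of the question's code): return $raw as
 * UTF-8, converting at most once.
 *
 * $sourceCharset is an assumption. ISO-8859-1 is only a placeholder for
 * the charset the remote page really uses.
 */
function to_utf8(string $raw, string $sourceCharset = 'ISO-8859-1'): string
{
    // Bytes that already form valid UTF-8 (plain ASCII included) are
    // returned untouched, so the data is never encoded twice.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }

    // Otherwise convert once from the assumed source charset.
    return mb_convert_encoding($raw, 'UTF-8', $sourceCharset);
}
```

With the download_page() function from the question, this would be called as $file = to_utf8($file); in place of the two conversion lines, after checking that download_page() did not return false. If the XML declaration names a different encoding, it may also need to be updated to UTF-8 before saving.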

Upvotes: 1
