ahmedavid

Reputation: 35

Can't figure out character encoding in PHP

I have put together a little utility for reading YouTube video tags: http://www.daviddresden.com/tagreader/

<?php
header("Content-Type: application/json");
error_reporting(E_ERROR | E_PARSE);
$_POST['fn']='https://www.youtube.com/watch?v=OgAt8Ehg0eo';
if(isset($_POST['fn']) && $_POST['fn'] != ''){
    $url = htmlentities($_POST['fn']);
    $page_content = file_get_contents('https://www.youtube.com/watch?v=OgAt8Ehg0eo');


    $dom_obj = new DOMDocument();
    if($dom_obj->loadHTML($page_content)){

        $dom_obj->loadHTML($page_content);
        $meta_val = '';

        foreach($dom_obj->getElementsByTagName('meta') as $meta) {

            if($meta->getAttribute('property')=='og:video:tag'){ 

                $meta_val = $meta_val.','.$meta->getAttribute('content');
            }
        }
        echo substr($meta_val,1);
    }
    else{
        echo "Invalid Url!";
    }
}
else{
    echo "Empty Url!";
}
?>

It works for ASCII characters, but UTF-8 characters come out unreadable. I can't find the problem.

Upvotes: 2

Views: 24

Answers (1)

Pedro Lobito

Reputation: 99001

utf8_decode

Converts a string with ISO-8859-1 characters encoded with UTF-8 to single-byte ISO-8859-1


Use utf8_decode when echoing the result:

echo utf8_decode(substr($meta_val,1)) ;
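A note on newer PHP versions: utf8_decode is deprecated as of PHP 8.2. The same conversion can be written with mb_convert_encoding, assuming the mbstring extension is available. The sketch below is only an illustration: the sample string is made up and stands in for an attribute value whose UTF-8 bytes were read as ISO-8859-1 and then encoded to UTF-8 a second time.

```php
<?php
// Hypothetical sample: "café" as correct UTF-8 bytes.
$original = "caf\xC3\xA9";

// Simulate the double encoding: treat each UTF-8 byte as an ISO-8859-1
// character and encode it to UTF-8 again.
$doubleEncoded = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Undo one layer, which is the conversion utf8_decode() performs.
$repaired = mb_convert_encoding($doubleEncoded, 'ISO-8859-1', 'UTF-8');

var_dump($repaired === $original);
```

In the script above, the equivalent call would be mb_convert_encoding(substr($meta_val,1), 'ISO-8859-1', 'UTF-8').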

Set the Content-Type charset to UTF-8:

header('Content-Type: text/html; charset=utf-8');

Full code:

header('Content-Type: text/html; charset=utf-8');
// Hard-coded for testing; remove this line to use the posted value.
$_POST['fn']='https://www.youtube.com/watch?v=OgAt8Ehg0eo';
if(isset($_POST['fn']) && $_POST['fn'] != ''){
    // htmlentities() is meant for HTML output and would mangle "&" in a URL.
    $url = $_POST['fn'];
    $page_content = file_get_contents($url);

    $dom_obj = new DOMDocument();
    // Parse the page once; skip parsing if the download failed.
    if($page_content !== false && $dom_obj->loadHTML($page_content)){

        $meta_val = '';

        // Collect the content of every og:video:tag meta element.
        foreach($dom_obj->getElementsByTagName('meta') as $meta) {

            if($meta->getAttribute('property')=='og:video:tag'){

                $meta_val = $meta_val.','.$meta->getAttribute('content');
            }
        }
        // Drop the leading comma, then undo the extra layer of UTF-8 encoding.
        echo utf8_decode(substr($meta_val,1));
    }
    else{
        echo "Invalid Url!";
    }
}
else{
    echo "Empty Url!";
}
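A likely reason utf8_decode helps here: when the markup does not declare a charset that the parser picks up, DOMDocument::loadHTML tends to read the bytes as a single-byte encoding, so each UTF-8 sequence ends up encoded twice. An alternative that avoids the problem at the source, rather than repairing the output, is to turn every non-ASCII character into a numeric entity before parsing. This is a minimal sketch under that assumption, not a tested drop-in for the script above; the sample markup and the $tags variable are made up for illustration, and mbstring is assumed to be installed.

```php
<?php
// Hypothetical sample markup standing in for the downloaded page (UTF-8 bytes).
$page_content = '<html><head>'
    . '<meta property="og:video:tag" content="' . "caf\xC3\xA9" . '">'
    . '<meta property="og:video:tag" content="' . "\xE6\x97\xA5\xE6\x9C\xAC" . '">'
    . '</head><body></body></html>';

// Rewrite every character above U+007F as a numeric entity, so the markup
// is pure ASCII and the parser's encoding guess no longer matters.
$ascii_html = mb_encode_numericentity($page_content, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom_obj = new DOMDocument();
$dom_obj->loadHTML($ascii_html);

$tags = [];
foreach ($dom_obj->getElementsByTagName('meta') as $meta) {
    if ($meta->getAttribute('property') === 'og:video:tag') {
        // DOM methods return UTF-8 strings, so no decoding step is needed.
        $tags[] = $meta->getAttribute('content');
    }
}

echo implode(',', $tags);
```

With this approach the values can be echoed as they are, as long as the response header still declares charset=utf-8.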

Upvotes: 1
