fpilee

Reputation: 2098

Can't load YouTube source in PHP script

Hello, I'm writing a PHP script to extract video URLs from a YouTube results page. This is what I have:

<?php
    error_reporting(1);

    function conseguir_codigo_url($url) {
        $dwnld = curl_init();
        curl_setopt($dwnld, CURLOPT_URL, $url);
        curl_setopt($dwnld, CURLOPT_HEADER, 0);
        // Must be defined, otherwise CURLOPT_USERAGENT below receives an undefined variable.
        $userAgent = 'Mozilla/4.0 (compatible; MSIE 6.01; Windows NT 6.0)';
        curl_setopt($dwnld, CURLOPT_USERAGENT, $userAgent);
        curl_setopt($dwnld, CURLOPT_RETURNTRANSFER, true);

        $fuente_url = curl_exec($dwnld);
        curl_close($dwnld);
        return $fuente_url;
    }

    function extraer_atributo_elemento($fuente) {
        $file = new DOMDocument;

        if($file->loadHTML($fuente) and $file->validate()){

            echo "DOCUMENTO";

            $file->getElementById("search-results");

        }
    }

    $codigo_url = conseguir_codigo_url("http://www.youtube.com/results?search_sort=video_date_uploaded&uni=3&search_type=videos&search_query=humor");
    extraer_atributo_elemento($codigo_url);
?>

The trouble is that I can't use getElementById; I think it may be because the page is HTML5. Do you have any suggestions to solve this? I need to parse the source and I don't know regex, so DOMDocument is the only way.

Upvotes: 0

Views: 93

Answers (1)

seb

Reputation: 2385

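Why do you use $file->validate()? If you just want to extract an element by ID, there is no need to call it. validate() checks the document against its DTD, which is likely to fail for pages fetched from the web, so the body of your if is probably never reached. Also, setting DOMDocument::recover to true before calling loadHTML may help to parse broken HTML from the net.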
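
A minimal sketch of that suggestion, assuming the page really contains an element with the ID you are looking for. The helper name find_by_id and the sample markup are made up for illustration and are not taken from the real YouTube page; the XPath fallback is an extra precaution, not part of the advice above.

```php
<?php
// Hypothetical helper: load HTML without calling validate() and return
// the element with the given id, or null if it is not found.
function find_by_id(string $html, string $id): ?DOMElement {
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->recover = true; // try to recover from broken markup
    $loaded = $doc->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    $element = $doc->getElementById($id);
    if ($element === null) {
        // Fallback via XPath; assumes $id contains no double quotes.
        $xpath = new DOMXPath($doc);
        $nodes = $xpath->query('//*[@id="' . $id . '"]');
        $node = ($nodes !== false) ? $nodes->item(0) : null;
        $element = ($node instanceof DOMElement) ? $node : null;
    }
    return $element;
}

// Made-up sample markup standing in for the downloaded page.
$sample = '<!DOCTYPE html><html><body>'
        . '<div id="search-results"><a href="/watch?v=abc">Video</a></div>'
        . '</body></html>';

$results = find_by_id($sample, 'search-results');
if ($results !== null) {
    // Print the href of every link inside the container.
    foreach ($results->getElementsByTagName('a') as $link) {
        echo $link->getAttribute('href'), "\n";
    }
}
```

In your script you would pass the string returned by conseguir_codigo_url() instead of $sample.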

Upvotes: 1
