fozuse

Reputation: 784

Can't load the XML file?

http://westwood-backup.com/podcast?categoryID2=403

This is the XML file that I want to load and echo via PHP. I tried file_get_contents and DOMDocument::load; both return an empty string. If I change the URL to another XML file, the functions work fine. What is special about this URL?

<?php 
$content = file_get_contents("http://westwood-backup.com/podcast?categoryID2=403");
echo $content;
?>

Another attempt with DOMDocument::load gives the same empty result.

<?php 
$feed = new DOMDocument();
if (@$feed->load("http://westwood-backup.com/podcast?categoryID2=403")) { 
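    $xpath = new DOMXPath($feed);
    $linkPath = $xpath->query("/rss/channel/link");
    // query() returns a DOMNodeList, so print the text of each matching node
    foreach ($linkPath as $link) {
        echo $link->textContent, "\n";
    }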
}
?>

Upvotes: 0

Views: 1658

Answers (2)

Steve

Reputation: 20459

You need to set a User-Agent header that the server accepts. There is no need for cURL if you don't want to use it; you can pass a context created with stream_context_create to file_get_contents:

$options = array(
        'http'=>array(
            'method'=>"GET",
            'header'=>"Accept-language: en\r\n" .
                "User-Agent: Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/531.21.10\r\n" // i.e. an iPad
        )
    );

$context = stream_context_create($options);
$content = file_get_contents("http://westwood-backup.com/podcast?categoryID2=403", false, $context);
echo $content;
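The same context can also be applied to the DOMDocument::load attempt from the question. The following is a minimal sketch, assuming the server only inspects the User-Agent header; the helper names makeUserAgentContext and loadChannelLinks, the 5-second timeout, and the use of the `user_agent` context option instead of a raw header string are illustrative choices, not part of the original answer.

```php
<?php
// Hypothetical helper: builds a stream context that sends the given User-Agent.
function makeUserAgentContext(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent, // sent as the User-Agent request header
            'timeout'    => 5,          // seconds; an arbitrary choice
        ],
    ]);
}

// Hypothetical helper: loads an RSS document and returns the text of
// every /rss/channel/link node.
function loadChannelLinks(string $source, $context): array
{
    // Tell libxml (and therefore DOMDocument::load) to use this context
    // for the next document it loads.
    libxml_set_streams_context($context);

    // Collect XML parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $feed = new DOMDocument();
    $loaded = $feed->load($source);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        throw new RuntimeException("Could not load XML from $source");
    }

    $links = [];
    $xpath = new DOMXPath($feed);
    foreach ($xpath->query('/rss/channel/link') as $node) {
        $links[] = trim($node->textContent);
    }
    return $links;
}
```

A call such as `print_r(loadChannelLinks('http://westwood-backup.com/podcast?categoryID2=403', makeUserAgentContext('Mozilla/5.0')));` would then print the channel links, provided the server accepts that User-Agent value. Throwing on failure, rather than suppressing errors with `@`, makes a rejected request visible instead of producing empty output.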

Upvotes: 1

Latheesan

Reputation: 24116

Use cURL; you can do it like this:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'http://westwood-backup.com/podcast?categoryID2=403');
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/1.22 (compatible; MSIE 2.0d; Windows NT)');
$xml = curl_exec($ch);
curl_close($ch);

$xml = new SimpleXMLElement($xml);
echo "<pre>";
print_r($xml);
echo "</pre>";

Outputs:

[Screenshot: print_r dump of the parsed SimpleXMLElement]


I think the server implements a User-Agent check to make sure the XML data is only loaded by a browser (not by bots, file_get_contents, etc.).

So, by using cURL and setting a dummy User-Agent, you can get around the check and load the data.
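Because a rejected request shows up only as an empty string, it can help to fail loudly instead. Below is a minimal sketch of the cURL approach with error checks added; the helper names fetchWithUserAgent and parseFeed are hypothetical, and treating any HTTP status of 400 or above as a failure is an assumption about how this server signals a rejected client.

```php
<?php
// Hypothetical helper: fetches a URL with cURL and throws on failure
// instead of silently returning an empty string.
function fetchWithUserAgent(string $url, string $userAgent): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);    // seconds, as in the answer above
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);

    $body   = curl_exec($ch);
    $error  = curl_error($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false) {
        throw new RuntimeException("cURL error: $error");
    }
    if ($status >= 400) {
        throw new RuntimeException("Server answered with HTTP status $status");
    }
    return $body;
}

// Hypothetical helper: parses the response and reports malformed XML.
function parseFeed(string $xml): SimpleXMLElement
{
    // Collect parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    try {
        return new SimpleXMLElement($xml);
    } catch (Exception $e) {
        throw new RuntimeException('Response is not well-formed XML: ' . $e->getMessage());
    } finally {
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }
}
```

Usage would look like `$xml = parseFeed(fetchWithUserAgent('http://westwood-backup.com/podcast?categoryID2=403', 'Mozilla/1.22 (compatible; MSIE 2.0d; Windows NT)'));`. Splitting fetching from parsing makes it easier to tell whether the server refused the request or returned something that is not XML.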

Upvotes: 2
