Linto

Reputation: 1282

Problem reading image URLs from an RSS feed using DOMDocument

I have an RSS feed:

<rss xmlns:media="http://search.yahoo.com/mrss/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">  

<item> 
      <title>VIDEO: Have you heard of Alibaba?</title>  
      <description>Alibaba is the world's biggest e-commerce firm but most people in the West haven't heard of it.</description>  
      <link>http://www.bbc.co.uk/news/business-29216696#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/business-29216696</guid>  
      <pubDate>Tue, 16 Sep 2014 02:29:17 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/77609000/jpg/_77609399_73619721.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/77609000/jpg/_77609400_73619721.jpg"/> 
    </item>  
    <item> 
      <title>VIDEO: Phones 4U shops closing for business</title>  
      <description>Retailer Phones 4U has gone into administration putting 5,596 jobs at risk.</description>  
      <link>http://www.bbc.co.uk/news/business-29202179#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/business-29202179</guid>  
      <pubDate>Mon, 15 Sep 2014 22:15:50 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/77587000/jpg/_77587217_77587209.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/77587000/jpg/_77587218_77587209.jpg"/> 
    </item> 
</rss>

I can read the title and description from this feed using PHP's DOMDocument class.

Here is my code:

$xml = 'http://feeds.bbci.co.uk/news/video_and_audio/business/rss.xml';
$xmlDoc = new DOMDocument();
$xmlDoc->load($xml);
$items = $xmlDoc->getElementsByTagName('item');
foreach ($items as $item) {
    $item_title = $item->getElementsByTagName('title')->item(0)->childNodes->item(0)->nodeValue;
    $item_link = $item->getElementsByTagName('link')->item(0)->childNodes->item(0)->nodeValue;
    $item_desc = $item->getElementsByTagName('description')->item(0)->childNodes->item(0)->nodeValue;
}

But how can I read the url attribute of the 'media:thumbnail' tags of each item?

Upvotes: 2

Views: 1787

Answers (2)

ThW

Reputation: 19492

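Use Xpath. It is part of the DOM extension and lets you use expressions to fetch nodes and values from a DOM. Like XML itself, Xpath allows you to define prefixes/aliases for namespaces. The prefix you register only has to map to the same namespace URI; it does not need to match the prefix used in the feed.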

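$dom = new DOMDocument;
// loadXml() expects the XML as a string; to read straight from the feed URL, use $dom->load($url) instead
$dom->loadXml($xml);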
$xpath = new DOMXpath($dom);
$xpath->registerNamespace('m', 'http://search.yahoo.com/mrss/');
$xpath->registerNamespace('a', 'http://www.w3.org/2005/Atom');

foreach ($xpath->evaluate('//item') as $itemNode) {
  $item = [
    'title' => $xpath->evaluate('string(title)', $itemNode),
    'link' => $xpath->evaluate('string(link)', $itemNode),
    'description' => $xpath->evaluate('string(description)', $itemNode),
  ];
  foreach ($xpath->evaluate('m:thumbnail/@url', $itemNode) as $urlAttribute) {
    $item['thumbnails'][] = $urlAttribute->value;
  }  
  var_dump($item);
}
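
To try the Xpath approach without fetching the live feed, here is a minimal self-contained sketch. It assumes a trimmed inline copy of the feed from the question, wrapped in a `<channel>` element as in a typical RSS 2.0 document. The helper name `thumbnailUrlsByTitle()` is purely illustrative and not part of the answer above.

```php
<?php
// Trimmed inline copy of the feed (assumption: items sit inside <channel>).
$xml = <<<'XML'
<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
  <channel>
    <item>
      <title>VIDEO: Have you heard of Alibaba?</title>
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/77609000/jpg/_77609399_73619721.jpg"/>
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/77609000/jpg/_77609400_73619721.jpg"/>
    </item>
  </channel>
</rss>
XML;

// Hypothetical helper: maps each item title to the list of its thumbnail URLs.
function thumbnailUrlsByTitle(string $xml): array
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);

    $xpath = new DOMXPath($dom);
    // 'm' is our own alias for the Media RSS namespace URI.
    $xpath->registerNamespace('m', 'http://search.yahoo.com/mrss/');

    $result = [];
    foreach ($xpath->evaluate('//item') as $itemNode) {
        $title = $xpath->evaluate('string(title)', $itemNode);
        $result[$title] = [];
        // Select the url attribute nodes of every media:thumbnail child.
        foreach ($xpath->evaluate('m:thumbnail/@url', $itemNode) as $urlAttribute) {
            $result[$title][] = $urlAttribute->value;
        }
    }
    return $result;
}

print_r(thumbnailUrlsByTitle($xml));
```

Initializing the per-item list before the inner loop means items without thumbnails still appear in the result with an empty array.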

Upvotes: 0

Kevin

Reputation: 41885

Since the thumbnail elements are in a namespace, use getElementsByTagNameNS() together with getAttribute(). Example:

$xml = 'http://feeds.bbci.co.uk/news/video_and_audio/business/rss.xml' ;
$xmlDoc = new DOMDocument();
$xmlDoc->load($xml);
$items = $xmlDoc->getElementsByTagName('item');
foreach($items as $key => $item) {
    $item_title= $item->getElementsByTagName('title')->item(0)->childNodes->item(0)->nodeValue;
    $item_link= $item->getElementsByTagName('link')->item(0)->childNodes->item(0)->nodeValue;
    $item_desc= $item->getElementsByTagName('description')->item(0)->childNodes->item(0)->nodeValue;
    $media = $item->getElementsByTagNameNS('http://search.yahoo.com/mrss/', 'thumbnail');
    foreach($media as $thumb) {
        echo $thumb->getAttribute('url') . '<br/>';
    }
}

SimpleXMLElement Variant:

$xml = simplexml_load_file('http://feeds.bbci.co.uk/news/video_and_audio/business/rss.xml');
foreach($xml->channel->item as $item) {
    $title = $item->title;
    $description = $item->description;
    $link = $item->link;
    // children() takes the namespace URI (or a prefix with true as the second argument)
    $media = $item->children('http://search.yahoo.com/mrss/');
    foreach($media->thumbnail as $thumb) {
        echo $thumb->attributes()->url . '<br/>';
    }
}
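
As a rough, self-contained sketch of the SimpleXML route, the following uses a trimmed inline copy of the feed (an assumption, so it runs without network access) and collects each thumbnail's URL and width into plain PHP values. The helper name `thumbnailsByItem()` is hypothetical. SimpleXML returns objects rather than strings, so the explicit casts matter whenever the values are used as array keys or compared.

```php
<?php
// Trimmed inline copy of the feed (assumption: items sit inside <channel>).
$xml = <<<'XML'
<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
  <channel>
    <item>
      <title>VIDEO: Phones 4U shops closing for business</title>
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/77587000/jpg/_77587217_77587209.jpg"/>
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/77587000/jpg/_77587218_77587209.jpg"/>
    </item>
  </channel>
</rss>
XML;

// Hypothetical helper: maps each item title to its thumbnails (url and width).
function thumbnailsByItem(string $xml): array
{
    $feed = simplexml_load_string($xml);
    $result = [];
    foreach ($feed->channel->item as $item) {
        // Select child elements by namespace URI rather than by prefix.
        $media = $item->children('http://search.yahoo.com/mrss/');
        $thumbs = [];
        foreach ($media->thumbnail as $thumb) {
            // The attributes themselves are not namespaced.
            $attrs = $thumb->attributes();
            $thumbs[] = [
                'url'   => (string) $attrs->url,
                'width' => (int) $attrs->width,
            ];
        }
        $result[(string) $item->title] = $thumbs;
    }
    return $result;
}

print_r(thumbnailsByItem($xml));
```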

Upvotes: 2
