vaanipala

Reputation: 1291

RSS feed XML: cannot access images after converting the RSS feed to an XML object

The feed at http://feeds.feedburner.com/rb286 contains many images. However, when I convert it into an XML object with SimpleXMLElement, I cannot see the images. My code:

if (function_exists("curl_init")) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://feeds.feedburner.com/rb286");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $data = curl_exec($ch);
    curl_close($ch);
    // print_r($data);  // here I can see the images
    $doc = new SimpleXMLElement($data);
    print_r($doc);      // here I cannot see the images
}

Can someone tell me how I can access the images after converting to an XML object? Thank you.

Upvotes: 0

Views: 635

Answers (1)

complex857

Reputation: 20753

You will have to iterate through the <content:encoded> tags of the individual <item> elements inside the main <channel> tag. I would use the xpath method to select them. Once you have the element you want, you can extract the <img> tags from it with string manipulation tools like preg_match_all:

Edit: added more refined image tag matching that excludes ads from FeedBurner and other CDNs.

$xml = simplexml_load_string(file_get_contents("http://feeds.feedburner.com/rb286"));

foreach ($xml->xpath('//item/content:encoded') as $desc) {
    preg_match_all('!(?<imgs><img.+?src=[\'"].*?http://feeds.feedburner.com.+?[\'"].+?>)!m', (string) $desc, $m);

    foreach ($m['imgs'] as $img) {
        print $img;
    }
}
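
To try the xpath approach without hitting the network, here is a minimal sketch that runs against an inline sample feed. The sample XML, the example.com URLs, and the simplified regex are assumptions made for illustration, not the real feed's content. It registers the content prefix explicitly and collects only the src values rather than the whole <img> tags:

```php
<?php
// Hypothetical sample feed standing in for the real FeedBurner feed.
$feed = <<<'XML'
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Sample</title>
    <item>
      <title>First post</title>
      <content:encoded><![CDATA[<p>Hello</p><img src="http://example.com/a.jpg" alt="a"><img src='http://example.com/b.png'>]]></content:encoded>
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($feed);

// Bind the "content" prefix to its namespace URI for the XPath query.
$xml->registerXPathNamespace('content', 'http://purl.org/rss/1.0/modules/content/');

$sources = [];
foreach ($xml->xpath('//item/content:encoded') as $encoded) {
    // Casting the element to string yields the HTML held in the CDATA section.
    if (preg_match_all('/<img\b[^>]*\bsrc=["\']([^"\']+)["\']/i', (string) $encoded, $m)) {
        // $m[1] holds the captured src values for this item.
        $sources = array_merge($sources, $m[1]);
    }
}

print_r($sources);
```

Casting to string matters here: print_r on the SimpleXML object does not list namespaced children, but the cast still returns the embedded HTML.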

The <content:encoded> tag is namespaced, so if you want to use simplexml's built in property mapping, you have to deal with it like this:

// obtain simplexml object of the feed as before
foreach ($xml->channel->item as $item) {
    $namespaces = $item->getNamespaces(true);
    $content = $item->children($namespaces['content']);
    print $content->encoded; // use it however you want
}
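
As an alternative to regular expressions, the HTML inside each <content:encoded> element can be handed to DOMDocument, which tends to cope better with unusual attribute order or quoting. This is only a sketch under assumptions: the inline feed, the example.com URL, and the $imagesByTitle variable are hypothetical, and it relies on the DOM extension being enabled.

```php
<?php
// Hypothetical sample feed; the real feed's structure may differ.
$feed = <<<'XML'
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>First post</title>
      <content:encoded><![CDATA[<p>Hello</p><img src="http://example.com/a.jpg" alt="a">]]></content:encoded>
    </item>
    <item>
      <title>Second post</title>
      <content:encoded><![CDATA[<p>No images here</p>]]></content:encoded>
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($feed);
$imagesByTitle = [];

foreach ($xml->channel->item as $item) {
    // Look up the namespace URI bound to the "content" prefix.
    $namespaces = $item->getNamespaces(true);
    $html = (string) $item->children($namespaces['content'])->encoded;

    $dom = new DOMDocument();
    // Feed HTML is often not well-formed, so collect libxml warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Gather the src attribute of every <img> found in this item.
    $sources = [];
    foreach ($dom->getElementsByTagName('img') as $img) {
        $sources[] = $img->getAttribute('src');
    }
    $imagesByTitle[(string) $item->title] = $sources;
}

print_r($imagesByTitle);
```

Keying the result by item title is just one convenient shape; a flat list works equally well if you only need the URLs.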

You can read more about the XPath query language in its documentation.

Upvotes: 2
