Reputation: 125
I'm trying to parse an RSS feed: http://www.mlssoccer.com/rss/en.xml
$feed = new DOMDocument();
$feed->load($url);
$items = $feed->getElementsByTagName('channel')->item(0)->getElementsByTagName('item');
foreach($items as $key => $item)
{
$title = $item->getElementsByTagName('title')->item(0)->firstChild->nodeValue;
$pubDate = $item->getElementsByTagName('pubDate')->item(0)->firstChild->nodeValue;
$description = $item->getElementsByTagName('description')->item(0)->firstChild->nodeValue;
// do some stuff
}
I get "$title" and "$pubDate" without a problem, but "$description" is always empty. What could cause this behaviour, and how can I fix it?
Upvotes: 0
Views: 194
Reputation: 19482
There can be whitespace between the opening <description> tag and the opening <![CDATA[. That whitespace is a text node of its own, so if you access the firstChild of description, you might fetch the whitespace text node rather than the CDATA section.
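To see this effect in isolation, here is a minimal sketch that parses a small inline feed instead of the real URL. The sample XML and the variable names are made up for illustration and only assume that the real feed wraps its CDATA in whitespace in a similar way.

```php
<?php
// Hypothetical sample: whitespace surrounds the CDATA section inside <description>.
$xml = <<<XML
<rss><channel><item>
  <title>Example title</title>
  <description>
    <![CDATA[<p>Example body</p>]]>
  </description>
</item></channel></rss>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml); // preserveWhiteSpace is left at its default (true)

$description = $doc->getElementsByTagName('description')->item(0);
$first = $description->firstChild;

// The first child is a plain text node holding only whitespace, not the CDATA section.
$firstIsText = $first->nodeType === XML_TEXT_NODE;
$firstIsBlank = trim($first->nodeValue) === '';

// textContent concatenates all descendant text, including the CDATA section.
$fullText = trim($description->textContent);

var_dump($firstIsText, $firstIsBlank, $fullText);
```

If both flags come out true for your feed as well, the empty "$description" is explained by the whitespace node.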
More generally, you can tell the DOMDocument to ignore whitespace-only nodes:
$feed = new DOMDocument();
$feed->preserveWhiteSpace = FALSE;
$feed->load($url);
Additionally you should check out XPath, it makes reading a DOM much easier:
$xpath = new DOMXpath($feed);
foreach ($xpath->evaluate('//channel/item') as $item) {
$title = $xpath->evaluate('string(title)', $item);
$pubDate = $xpath->evaluate('string(pubDate)', $item);
$description = $xpath->evaluate('string(description)', $item);
// do some stuff
var_dump([$title, $pubDate, $description]);
}
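As a self-contained variant of the XPath approach, the sketch below runs against a small inline feed (invented for illustration, not the real one) and uses normalize-space() instead of string() for the description. That is a swapped-in detail, not part of the snippet above: it trims the whitespace around the CDATA section, but it also collapses runs of whitespace inside the text, which may or may not be what you want.

```php
<?php
// Hypothetical sample feed with one item.
$xml = <<<XML
<rss version="2.0"><channel>
  <item>
    <title>First</title>
    <pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>
    <description>
      <![CDATA[<p>Body one</p>]]>
    </description>
  </item>
</channel></rss>
XML;

$feed = new DOMDocument();
$feed->loadXML($xml);
$xpath = new DOMXPath($feed);

$rows = [];
foreach ($xpath->evaluate('//channel/item') as $item) {
    $rows[] = [
        // string() returns the text content of the first matching node.
        'title'       => $xpath->evaluate('string(title)', $item),
        'pubDate'     => $xpath->evaluate('string(pubDate)', $item),
        // normalize-space() additionally trims and collapses whitespace.
        'description' => $xpath->evaluate('normalize-space(description)', $item),
    ];
}

var_dump($rows);
```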
Upvotes: 1
Reputation: 983
The problem is the CDATA section: read textContent on the <description> element instead of nodeValue on its firstChild to retrieve the value:
<?php
$feed = new DOMDocument();
$feed->load('http://www.mlssoccer.com/rss/en.xml');
$items = $feed->getElementsByTagName('channel')->item(0)->getElementsByTagName('item');
foreach($items as $key => $item)
{
$title = $item->getElementsByTagName('title')->item(0)->firstChild->nodeValue;
$pubDate = $item->getElementsByTagName('pubDate')->item(0)->firstChild->nodeValue;
$description = $item->getElementsByTagName('description')->item(0)->textContent; // textContent
}
Upvotes: 3