Reputation: 111
I use the following PHP to remove items from an XML file I own if they are more than 8 days old. It worked fine once before, but now it gives me this error message:
Call to a member function removeChild() on a non-object in /Users//DateTest-3.php on line 40
Line 40 is
$node->parentNode->removeChild($node);
Any ideas why this is throwing the error?
<?php
$rss = new DOMDocument();
$url = 'http://URL.com/Test.xml';
$rss->load($url);
$feed = array();
foreach ($rss->getElementsByTagName('item') as $node) {
    $item = array (
        'title' => $node->getElementsByTagName('title')->item(0)->nodeValue,
        'desc' => $node->getElementsByTagName('description')->item(0)->nodeValue,
        'link' => $node->getElementsByTagName('link')->item(0)->nodeValue,
        'date' => $node->getElementsByTagName('date')->item(0)->nodeValue,
    );
    array_push($feed, $item);
}
$limit = 50;
for ($i = 0; $i < count($feed); $i++) {
    date_default_timezone_set('America/Los_Angeles');
    $newDate = strtotime("-8 day");
    $date = strtotime($feed[$i]['date']);
    if ($date > $newDate) {
        echo "Don't delete";
    } else {
        echo "Delete";
        $node->parentNode->removeChild($node);
    }
}
$rss->save("Test.xml")
?>
Upvotes: 1
Views: 393
Reputation: 41737
In RSS 1.0 there is no 'date' element on items; the Dublin Core element 'dc:date' is used instead. http://web.resource.org/rss/1.0/spec#s5.5
In RSS 2.0 there is no 'date' element either; items use 'pubDate'. http://cyber.law.harvard.edu/rss/rss.html#hrelementsOfLtitemgt
Decide whether you want to look for 'date', 'dc:date', or 'pubDate'. The code further down in this answer works with pubDate.
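If the feed format is not known in advance, the lookup could try each of the three names in turn. The following is only a sketch and is separate from the code further down in this answer: findItemDate() and the DC_NS constant are hypothetical names introduced here, and the inline XML is made-up sample data. It assumes the feed declares the usual Dublin Core namespace URI for the dc: prefix.

```php
<?php
// Dublin Core namespace URI, commonly bound to the "dc" prefix in RSS 1.0 feeds.
const DC_NS = 'http://purl.org/dc/elements/1.1/';

// Hypothetical helper: returns the first date string found on an item, or null.
function findItemDate(DOMElement $item)
{
    $candidates = array(
        $item->getElementsByTagName('pubDate'),       // RSS 2.0
        $item->getElementsByTagNameNS(DC_NS, 'date'), // RSS 1.0 with Dublin Core
        $item->getElementsByTagName('date'),          // non-standard feeds
    );
    foreach ($candidates as $list) {
        if ($list->length > 0) {
            return trim($list->item(0)->nodeValue);
        }
    }
    return null; // no date element at all
}

// Made-up sample feed with one item per case.
$xml = <<<XML
<rss xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item><title>a</title><pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate></item>
    <item><title>b</title><dc:date>2024-01-02T00:00:00Z</dc:date></item>
    <item><title>c</title></item>
  </channel>
</rss>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$dates = array();
foreach ($doc->getElementsByTagName('item') as $item) {
    $dates[] = findItemDate($item);
}
var_dump($dates);
```

Returning null for items without any date lets the caller decide whether such items are kept or removed, instead of failing on item(0) being null.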
The line $limit = 50; was unused.
Removing nodes from a node list while iterating over it does not work, because the list is live. This is a well-known pitfall; see the comments here: http://php.net/manual/de/domnode.removechild.php The solution is to put the nodes to be removed into a queue and remove them after the loop.
I have taken the liberty of reworking the code a bit. I intentionally left the debug output active, mainly for the date comparison and for displaying the reduced list. The code is commented.
Please adjust the feed URL and the "-x days" value in the condition. I had to use a public RSS feed to test things.
--
<?php
date_default_timezone_set('America/Los_Angeles');

$feed = array();               // target array for filtered items
$nodesToRemoveQueue = array(); // stores all nodes to remove

$rss = new DOMDocument();
$url = 'http://rss.nytimes.com/services/xml/rss/nyt/Space.xml';
$rss->load($url);

$nodeList = $rss->getElementsByTagName('item');
foreach ($nodeList as $node)
{
    $pubDate = $node->getElementsByTagName('pubDate')->item(0)->nodeValue;

    // if the date in the XML feed is older than the desired number of days, queue the node
    // for removal and continue with the next item (do not copy its data into the $feed array)
    if (isDateOlderThenDays($pubDate, '-5 days')) {
        echo 'Removed ' . $pubDate . '<br>';
        // $node->parentNode->removeChild($node); this won't work!!
        $nodesToRemoveQueue[] = $node; // put node in queue, remove later
        continue;
    }

    echo 'Kept ' . $pubDate . '<br>';

    // build item for $feed array, then add item to $feed array
    $item = array (
        'title' => $node->getElementsByTagName('title')->item(0)->nodeValue,
        'desc' => $node->getElementsByTagName('description')->item(0)->nodeValue,
        'link' => $node->getElementsByTagName('link')->item(0)->nodeValue,
        'date' => $pubDate,
    );
    $feed[] = $item;
}

// helper to compare dates
function isDateOlderThenDays($date, $days)
{
    // true when the pubDate ($date) is earlier (older) than the cutoff given by $days, else false
    return strtotime($date) < strtotime($days);
}

// feed array contains all the items that are not "outdated"
var_dump($feed);

// finally: remove the "outdated" nodes
foreach ($nodesToRemoveQueue as $node) {
    $node->parentNode->removeChild($node);
}

// node list reduction check: this should display only the dates that were kept
$nodeList = $rss->getElementsByTagName('item');
foreach ($nodeList as $node) {
    echo $node->getElementsByTagName('pubDate')->item(0)->nodeValue . '<br>';
}

// write reduced RSS XML to file
$rss->save(__DIR__.'/Test.xml');
Another way of saving the XML is:
$xmlString = $rss->saveXML();
file_put_contents(__DIR__.'/Test.xml', $xmlString);
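A variation on the queue idea, not the method used above: copy the live node list into a plain array with iterator_to_array() before touching the tree, then remove nodes while looping over that copy. The sketch below uses made-up inline XML so that it runs without network access; the item titles and the 8-day cutoff are assumptions taken from the question, not from a real feed.

```php
<?php
date_default_timezone_set('UTC');

// Made-up sample data; dates are generated relative to "now" so the demo does not go stale.
$fresh = date(DATE_RSS, strtotime('-1 day'));
$stale = date(DATE_RSS, strtotime('-30 days'));
$xml = "<rss><channel>"
     . "<item><title>old 1</title><pubDate>$stale</pubDate></item>"
     . "<item><title>new</title><pubDate>$fresh</pubDate></item>"
     . "<item><title>old 2</title><pubDate>$stale</pubDate></item>"
     . "</channel></rss>";

$rss = new DOMDocument();
$rss->loadXML($xml);

// Snapshot: a plain PHP array does not change when the document changes.
$items = iterator_to_array($rss->getElementsByTagName('item'));
$cutoff = strtotime('-8 days');

foreach ($items as $node) {
    $pubDate = $node->getElementsByTagName('pubDate')->item(0)->nodeValue;
    if (strtotime($pubDate) < $cutoff) {
        // Safe here because we iterate over the array copy, not the live list.
        $node->parentNode->removeChild($node);
    }
}

$remaining = $rss->getElementsByTagName('item');
$remainingTitle = $remaining->item(0)->getElementsByTagName('title')->item(0)->nodeValue;
echo $remaining->length, "\n"; // 1
echo $remainingTitle, "\n";    // new
```

Compared with a separate removal queue this needs only one loop, at the cost of holding every item node in memory at once, which should rarely matter for a feed.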
Upvotes: 1
Reputation: 622
In your second loop, reassign $node on every iteration, e.g. $node = $feed[$i].
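A sketch of how this reassignment could look. In the question, $feed[$i] holds only strings, so this version additionally stores the DOM node itself under a hypothetical 'node' key; that key, the inline sample XML, and the generated dates are assumptions made for illustration.

```php
<?php
date_default_timezone_set('UTC');

// Made-up sample data with one stale and one recent item.
$old = date('Y-m-d', strtotime('-30 days'));
$new = date('Y-m-d', strtotime('-1 day'));
$rss = new DOMDocument();
$rss->loadXML("<rss><channel>"
    . "<item><title>stale</title><date>$old</date></item>"
    . "<item><title>recent</title><date>$new</date></item>"
    . "</channel></rss>");

$feed = array();
foreach ($rss->getElementsByTagName('item') as $node) {
    $feed[] = array(
        'title' => $node->getElementsByTagName('title')->item(0)->nodeValue,
        'date'  => $node->getElementsByTagName('date')->item(0)->nodeValue,
        'node'  => $node, // hypothetical extra key: keep the DOM node itself
    );
}

$cutoff = strtotime('-8 days');
for ($i = 0; $i < count($feed); $i++) {
    $node = $feed[$i]['node']; // reassigned on every iteration
    if (strtotime($feed[$i]['date']) <= $cutoff) {
        $node->parentNode->removeChild($node);
    }
}

$kept = $rss->getElementsByTagName('item');
$keptTitle = $kept->item(0)->getElementsByTagName('title')->item(0)->nodeValue;
echo $kept->length, "\n"; // 1
echo $keptTitle, "\n";    // recent
```

Because the removal happens in the second loop, which iterates over the $feed array rather than over the node list, the live-list problem does not arise here.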
Upvotes: 0
Reputation: 622
Is it on purpose that you only work on the last node after the foreach ($rss->getElementsByTagName('item') as $node) loop? After that loop, $node still holds the value from its last iteration. Or is some code missing?
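The effect can be reproduced in a few lines. This is a minimal sketch with made-up inline XML; it only illustrates the behaviour described here and is not the asker's actual file.

```php
<?php
$rss = new DOMDocument();
$rss->loadXML('<rss><channel><item>a</item><item>b</item></channel></rss>');

foreach ($rss->getElementsByTagName('item') as $node) {
    // nothing to do; the loop only advances $node
}

// $node still exists after the loop and holds the last <item>.
$lastValue = $node->nodeValue;
echo $lastValue, "\n"; // b

$node->parentNode->removeChild($node);    // first removal succeeds
$detached = ($node->parentNode === null); // the removed node has no parent any more
var_dump($detached);                      // bool(true)

// A second $node->parentNode->removeChild($node) would now call
// removeChild() on null, which matches the error quoted in the question.
```

This would also explain why the script worked once: with only a single outdated item, removeChild() is called once on the last node and succeeds, although it may remove the wrong item.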
Upvotes: 0