Reputation: 5273
I have this sample code that extracts the value of each tag. In addition to the value, I want to get the class name of each tag.
<?php
$doc = new DOMDocument;
$doc->loadXML( <<< eox
<tr class="calendar_row" data-eventid="42023">
<td class="date"/>
<td class="time">All Day</td>
<td class="currency">CAD</td>
<td class="impact">
<span title="Non-Economic" class="holiday"/>
</td>
<td class="event">
<span>Bank Holiday</span>
</td>
<td class="detail">
<a class="calendar_detail level1" data-level="1" title="Open Detail"/>
</td>
<td class="actual"/>
<td class="forecast"/>
<td class="previous"/>
<td class="graph"/>
</tr>
eox
);
$xpath = new DOMXPath($doc);
foreach( $xpath->query('//tr[@data-eventid="42023"]/td[@class]') as $n ) {
echo $n->nodeName.'-'.$n->nodeValue."<br />";
}
?>
Using the snippet above, all I want is to get those values even if some tags aren't well formed (I'm scraping a web source). How can I do this with a DOMDocument XPath query? I am having trouble because the values being fetched are:
td-
td-All Day
td-CAD
td-
td-Bank Holiday
td-
td-
td-
td-
td-
instead of:
date-
time-All Day
currency-CAD
impact-
event-Bank Holiday
detail-
actual-
forecast-
previous-
graph-
Upvotes: 1
Views: 4744
Reputation: 2034
echo $n->getAttribute("class") . '-' . $n->nodeValue . "<br />";
Upvotes: 1
Reputation: 72652
Instead of $n->nodeName, which returns the tag name (td), you should use $n->getAttribute('class').
Demo: http://codepad.viper-7.com/ktpnv2
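For reference, here is a minimal sketch of the corrected loop on a shortened copy of the row from the question. It makes two assumptions that are not in the original snippet: the row is wrapped in a <table> element, and the markup is parsed with loadHTML() instead of loadXML(), since loadHTML() is generally more forgiving of markup that is not well formed (the scraping case mentioned in the question). The $lines array is only there for illustration.

```php
<?php
// Shortened sample row; the <table> wrapper is an assumption added for the HTML parser.
$html = <<<HTML
<table>
<tr class="calendar_row" data-eventid="42023">
<td class="date"></td>
<td class="time">All Day</td>
<td class="currency">CAD</td>
<td class="event"><span>Bank Holiday</span></td>
</tr>
</table>
HTML;

$doc = new DOMDocument;
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);            // assumption: HTML parser used instead of loadXML()
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$lines = [];
foreach ($xpath->query('//tr[@data-eventid="42023"]/td[@class]') as $n) {
    // getAttribute('class') gives the class name; nodeName would only give "td"
    $lines[] = $n->getAttribute('class') . '-' . trim($n->nodeValue);
}

echo implode("\n", $lines), "\n";
```

The trim() call is optional; it only strips the whitespace that surrounds nested elements such as the <span> when they sit on their own lines.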
Upvotes: 3