Reputation: 1773
I'm trying to get the HTML markup of a table in a page:
$new_dom = new DOMDocument();
$table = '';
$nodesTable = $this->dom->getElementsByTagName("table");
foreach ($nodesTable as $nodeTable) {
    $color = $nodeTable->getAttribute('bordercolordark');
    if ($color == '#73BAFF') {
        $table = $nodeTable;
    }
}
$new_dom->appendChild($table);
echo $new_dom->saveHTML();
Here is somepage.html:
<html>
<table>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
</table>
<table border="1" cellpadding="0" width="500" bordercolorlight="#ACD6FF" bordercolordark="#73BAFF" align="center">
<tr>
<td rowspan="2" colspan="2" bgcolor="#73BAFF"> </td>
<td colspan="3" align="center" bgcolor="#ACD6FF"> Element 1 </td>
<td colspan="3" align="center" bgcolor="#ACD6FF"> Element 2 </td>
</tr>
<tr>
<td width="50" align="center" bgcolor="#ACD6FF"> 50 </td>
<td width="50" align="center" bgcolor="#ACD6FF"> 50 </td>
<td width="50" align="center" bgcolor="#ACD6FF"> 50 </td>
<td width="50" align="center" bgcolor="#ACD6FF"> 50 </td>
<td width="50" align="center" bgcolor="#ACD6FF"> 50 </td>
<td width="50" align="center" bgcolor="#ACD6FF"> 50 </td>
</tr>
<tr>
<td bgcolor="#ACD6FF" width="155" align="center"> Row 1</td>
<td bgcolor="#ACD6FF" width="45" align="center"> 30 </td>
<td align="center"> 50 </td>
<td align="center"> 50 </td>
<td align="center"> 50 </td>
<td align="center"> 50 </td>
<td align="center"> 50 </td>
<td align="center"> 50 </td>
</tr>
<tr>
<td bgcolor="#ACD6FF" width="155" align="center"> Row 2</td>
<td bgcolor="#ACD6FF" width="45" align="center"> 30 </td>
<td align="center"> 60 </td>
<td align="center"> 60 </td>
<td align="center"> 60 </td>
<td align="center"> 60 </td>
<td align="center"> 60 </td>
<td align="center"> 60 </td>
</tr>
<tr>
<td bgcolor="#ACD6FF" width="155" align="center"> Row 3</td>
<td bgcolor="#ACD6FF" width="45" align="center"> 30 </td>
<td align="center"> 70 </td>
<td align="center"> 70 </td>
<td align="center"> 70 </td>
<td align="center"> 70 </td>
<td align="center"> 70 </td>
<td align="center"> 70 </td>
</tr>
</table>
<table>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
</table>
<table>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
</table>
</html>
$new_dom just outputs \n instead of HTML markup. I tried looking at this answer: PHP DOMDocument stripping HTML tags, but appending the table this way didn't work either.
Upvotes: 0
Views: 107
Reputation: 26139
Fatal error: Uncaught exception 'DOMException' with message 'Wrong Document Error'
So you cannot move nodes from one document to another. If you want to do that, you have to use importNode() with the deep flag set.
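$dom = new DOMDocument();
$dom->loadHTMLFile('x.html');
$new_dom = new DOMDocument();
$table = null;
$nodesTable = $dom->getElementsByTagName("table");
foreach ($nodesTable as $nodeTable) {
    $color = $nodeTable->getAttribute('bordercolordark');
    if ($color == '#73BAFF') {
        // Second argument true = deep import: the table and all of its children are copied.
        $table = $new_dom->importNode($nodeTable, true);
    }
}
// Append only if a matching table was found; appendChild() needs a node, not a string.
if ($table !== null) {
    $new_dom->appendChild($table);
}
echo $new_dom->saveHTML();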
Without the deep flag, importNode() imports only the table element, but not its children.
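To make the effect of the deep flag concrete, here is a minimal sketch that imports the same table twice, once shallow and once deep, and counts the rows that arrive in the target document. The inline two-row table and the variable names are made up for illustration; they are not taken from somepage.html.

```php
<?php
// Hypothetical two-row table standing in for the real page.
$source = new DOMDocument();
$source->loadHTML('<html><body><table><tr><td>50</td></tr><tr><td>60</td></tr></table></body></html>');
$tableNode = $source->getElementsByTagName('table')->item(0);

// Shallow import: only the <table> element itself, no child nodes.
$shallowDoc = new DOMDocument();
$shallow = $shallowDoc->importNode($tableNode, false);
$shallowDoc->appendChild($shallow);

// Deep import: the element together with all of its descendants.
$deepDoc = new DOMDocument();
$deep = $deepDoc->importNode($tableNode, true);
$deepDoc->appendChild($deep);

$shallowRows = $shallow->getElementsByTagName('tr')->length;
$deepRows = $deep->getElementsByTagName('tr')->length;

echo "shallow rows: $shallowRows\n"; // shallow rows: 0
echo "deep rows: $deepRows\n";       // deep rows: 2
echo $deepDoc->saveHTML();
```

The shallow copy serializes to an empty table, which is easy to mistake for the import not working at all.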
Note: I'd disable the entity loader in your case with libxml_disable_entity_loader(true); (this function is deprecated as of PHP 8.0). I am not sure whether XXE attacks work with loadHTML() too, but just for the sake of security.
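As an aside, if the goal is only the markup string of the table, a second document may not be needed at all: DOMDocument::saveHTML() accepts an optional node argument and then serializes just that subtree. This is a different technique from importNode(), shown here as a sketch; the inline HTML is a made-up stand-in for somepage.html, and `$markup` is a hypothetical variable name.

```php
<?php
// Hypothetical inline markup standing in for somepage.html.
$html = '<html><body>'
      . '<table><tr><td>10</td></tr></table>'
      . '<table bordercolordark="#73BAFF"><tr><td>Row 1</td></tr></table>'
      . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$markup = '';
foreach ($dom->getElementsByTagName('table') as $nodeTable) {
    if ($nodeTable->getAttribute('bordercolordark') === '#73BAFF') {
        // Passing a node makes saveHTML() serialize only that node and its children.
        $markup = $dom->saveHTML($nodeTable);
        break;
    }
}

echo $markup, "\n";
```

If the table has to live in a separate document for further manipulation, importNode() with the deep flag is still the way to go.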
Upvotes: 2