Reputation: 2585
$url = file_get_contents('test.html');
$DOM = new DOMDocument();
$DOM->loadHTML(mb_convert_encoding($url, 'HTML-ENTITIES', 'UTF-8'));
$trs = $DOM->getElementsByTagName('tr');
foreach ($trs as $tr) {
    foreach ($tr->childNodes as $td) {
        echo ' ' . $td->nodeValue;
    }
}
test.html
<html>
<body>
<table>
<tbody>
<tr>
<td style="background-color: #FFFF80;">1</td>
<td><a href="test1.php" title="test1">test1</a></td>
</tr>
<tr>
<td style="background-color: #FFFF80;">2</td>
<td><a href="test2.php" title="test2">test2</a></td>
</tr>
<tr>
<td style="background-color: #FFFF80;">3</td>
<td><a href="test3.php" title="test3">test3</a></td>
</tr>
</tbody>
</table>
</body>
</html>
In the result I get:
1 test1 2 test2 3 test3
But how do I get the link (the href) from the a inside each td?
And how do I get the HTML of each td?
P.S.: I tried $td->find('a'); and $td->getElementsByTagName('a'); but neither works.
Upvotes: 0
Views: 248
Reputation: 1597
I improved your code a little bit and this version works fine for me:
$url = file_get_contents('test.html'); // same input as in the question
$DOM = new DOMDocument();
$DOM->loadHTML(mb_convert_encoding($url, 'HTML-ENTITIES', 'UTF-8'));
$trs = $DOM->getElementsByTagName('tr');
foreach ($trs as $tr) {
    foreach ($tr->childNodes as $td) {
        if ($td->hasChildNodes()) { // skips the whitespace text nodes between the <td> elements
            foreach ($td->childNodes as $i) {
                // only elements that carry an href (the <a> tags) are printed
                if ($i instanceof DOMElement && $i->hasAttribute('href')) {
                    echo $i->getAttribute('href') . "\n";
                }
            }
        }
    }
}
Result:
test1.php
test2.php
test3.php
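As for why the attempts in the question failed: $tr->childNodes also contains the whitespace text nodes between the cells, and a DOMText node has no getElementsByTagName() method, so the call breaks on the first of them. find() is not part of PHP's DOM extension at all; as far as I know it comes from other libraries such as Simple HTML DOM. Below is a minimal sketch of an alternative that iterates over the td elements directly and also collects the inner HTML of each cell, which was the second part of the question. The inline $html string is a shortened stand-in for test.html, and the $links and $cells arrays are names made up for this example. The sample is plain ASCII, so the encoding conversion is left out here.

```php
<?php
// Inline stand-in for test.html (trimmed to two rows) so the sketch is self-contained.
$html = <<<HTML
<html><body><table><tbody>
<tr><td style="background-color: #FFFF80;">1</td><td><a href="test1.php" title="test1">test1</a></td></tr>
<tr><td style="background-color: #FFFF80;">2</td><td><a href="test2.php" title="test2">test2</a></td></tr>
</tbody></table></body></html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);

$links = []; // href of every <a> found inside a <td>
$cells = []; // inner HTML of every <td>

// getElementsByTagName('td') yields only DOMElement nodes, so the whitespace
// text nodes that appear in $tr->childNodes are never visited.
foreach ($dom->getElementsByTagName('td') as $td) {
    foreach ($td->getElementsByTagName('a') as $a) {
        $links[] = $a->getAttribute('href');
    }

    // Inner HTML of the cell: serialize each child node and concatenate.
    $inner = '';
    foreach ($td->childNodes as $child) {
        $inner .= $dom->saveHTML($child);
    }
    $cells[] = $inner;
}

print_r($links);
print_r($cells);
```

If the outer markup of the cell is wanted as well, $dom->saveHTML($td) should return the td element together with its tags.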
Upvotes: 2