Reputation: 729
I have HTML content like the following:
<table>
<tr>
<td>xyx...</td>
<td>abc....</td>
<td><span><h3>Downloads</h3></span><br>blah blah blah...</td>
</tr>
<tr>
<td><h3>Downloads</h3>again some content.</td>
<td>dddd</td>
<td>kkkl...</td>
</tr>
</table>
Now I am trying to delete every 'td' that has the word 'Downloads' anywhere in its content. After some research on the internet I got something working, and the code is as follows:
$res_text = 'MY HTML';
# Create a DOM parser object
$dom = new DOMDocument();
# Parse the HTML string.
# The @ before the method call suppresses any warnings that
# loadHTML might throw because of invalid HTML in the input.
@$dom->loadHTML($res_text);
$selector = new DOMXPath($dom);
$results = $selector->query('//*[text()[contains(.,"Downloads")]]');
if ($results->length) {
    foreach ($results as $res) {
        $res->parentNode->removeChild($res);
    }
}
This deletes the word 'Downloads' together with its immediate parent node (<span> or <p>), but I want the whole <td> to be deleted along with its content.
I tried...
$results = $selector->query('//td[text()[contains(.,"Downloads")]]');
but it doesn't work. Can someone tell me how to do this?
Upvotes: 2
Views: 91
Reputation: 11665
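You don't need text() in your query. text() selects only the text nodes that are direct children of the <td>, while 'Downloads' sits inside a nested <h3>. contains(., ...) tests the whole string value of the <td>, descendants included, so it should be: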
$results = $selector->query('//td[contains(.,"Downloads")]');
The whole code:
$dom = new DOMDocument();
$dom->loadHTML($res_text);
$selector = new DOMXPath($dom);
$results = $selector->query('//td[contains(.,"Downloads")]');
if ($results->length) {
    foreach ($results as $res) {
        $res->parentNode->removeChild($res);
    }
}
echo htmlentities($dom->saveHTML());
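To see why the two queries behave differently, here is a minimal sketch that runs both against the sample table from the question. The variable names ($direct, $anyDepth, $directCount and so on) are made up for illustration, and libxml_use_internal_errors() is used here instead of @ purely as an assumption about how you might want to silence parser warnings; it is not required for the fix.

```php
<?php
// Sample markup from the question (cell text kept as posted).
$html = '<table>'
    . '<tr><td>xyx...</td><td>abc....</td>'
    . '<td><span><h3>Downloads</h3></span><br>blah blah blah...</td></tr>'
    . '<tr><td><h3>Downloads</h3>again some content.</td>'
    . '<td>dddd</td><td>kkkl...</td></tr>'
    . '</table>';

$dom = new DOMDocument();
// Collect parser warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// text() only sees text nodes that are direct children of <td>.
$direct = $xpath->query('//td[text()[contains(., "Downloads")]]');

// "." is the string value of <td>: all descendant text joined together.
$anyDepth = $xpath->query('//td[contains(., "Downloads")]');

$directCount = $direct->length;
$anyDepthCount = $anyDepth->length;

// Remove every matching cell, then count what is left.
foreach ($anyDepth as $td) {
    $td->parentNode->removeChild($td);
}
$remainingCells = $xpath->query('//td')->length;

echo "direct: $directCount, any depth: $anyDepthCount, remaining: $remainingCells\n";
```

With the sample markup, the text() variant should match nothing because 'Downloads' only ever appears inside an <h3>, while the "." variant matches both cells that contain it.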
Upvotes: 2