Reputation: 12882
When loading HTML into a <textarea>, I intend to treat different kinds of links differently. Consider the following links:
<a href="http://stackoverflow.com">http://stackoverflow.com</a>
<a href="http://stackoverflow.com">StackOverflow</a>
When the text inside a link matches its href attribute, I want to remove the link markup and keep only the URL; otherwise the HTML should remain unchanged.
Here's my code:
$body = "Some HTML with a <a href=\"http://stackoverflow.com\">http://stackoverflow.com</a>";
$dom = new DOMDocument;
$dom->loadHTML($body, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
foreach ($dom->getElementsByTagName('a') as $node) {
    $link_text = $node->ownerDocument->saveHTML($node->childNodes[0]);
    $link_href = $node->getAttribute("href");
    $link_node = $dom->createTextNode($link_href);
    $node->parentNode->replaceChild($link_node, $node);
}
$html = $dom->saveHTML();
The problem with the above code is that DOMDocument wraps my HTML in a paragraph tag:
<p>Some HTML with a http://stackoverflow.com</p>
How do I get it to return only the inner HTML of that paragraph?
Upvotes: 3
Views: 2871
Reputation: 19780
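A valid DOM document needs a root node.
I suggest adding your own <div> root node, so that removing it afterwards does not destroy a root element that may already exist in the input.
Finally, read the nodeValue of the root node (which returns the text content only), or strip the wrapper from the saveHTML() output with substr().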
$body = "Some HTML with a <a href=\"http://stackoverflow.com\">http://stackoverflow.com</a>";
// Wrap the fragment in a temporary root node.
$body = '<div>'.$body.'</div>';
$dom = new DOMDocument;
$dom->loadHTML($body, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
// Iterate over a copy: replacing nodes in the live node list can skip elements.
foreach (iterator_to_array($dom->getElementsByTagName('a')) as $node) {
    $link_text = $node->textContent;
    $link_href = $node->getAttribute("href");
    // Only unwrap links whose text matches their href.
    if ($link_text === $link_href) {
        $link_node = $dom->createTextNode($link_href);
        $node->parentNode->replaceChild($link_node, $node);
    }
}
$html = $dom->saveHTML();
$html = substr($html, 5, -7); // strip the leading "<div>" and the trailing "</div>\n"
var_dump($html); // "Some HTML with a http://stackoverflow.com"
This also works if the input string is:
<p>Some HTML with a <a href="http://stackoverflow.com">http://stackoverflow.com</a></p>
The output is:
<p>Some HTML with a http://stackoverflow.com</p>
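As an alternative to trimming the string with substr(), the children of the temporary wrapper can be serialized one by one, so the result does not depend on exact character offsets. The following is only a sketch under that assumption: the function name unwrapSelfLinks() is hypothetical, and the href comparison is written to match the behaviour described in the question.

```php
<?php
// Hypothetical helper (not part of the original answer): replaces links whose
// text equals their href with plain text and returns the fragment without
// the temporary wrapper.
function unwrapSelfLinks(string $body): string
{
    $dom = new DOMDocument;
    // Temporary <div> root, so libxml does not add a <p> wrapper of its own.
    $dom->loadHTML('<div>' . $body . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    // Copy the live node list first: replacing nodes while iterating it can skip links.
    foreach (iterator_to_array($dom->getElementsByTagName('a')) as $a) {
        $href = $a->getAttribute('href');
        if ($a->textContent === $href) {
            $a->parentNode->replaceChild($dom->createTextNode($href), $a);
        }
    }

    // Serialize each child of the wrapper instead of cutting the wrapper off the string.
    $html = '';
    foreach ($dom->documentElement->childNodes as $child) {
        $html .= $dom->saveHTML($child);
    }
    return $html;
}

echo unwrapSelfLinks('Some HTML with a <a href="http://stackoverflow.com">http://stackoverflow.com</a>');
```

With this approach a link such as <a href="http://stackoverflow.com">StackOverflow</a> is left untouched, because its text differs from its href.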
Upvotes: 1