Reputation: 2237
OK, I'm stumped here: how can I remove a parent node and replace it with its child?
My goal is to remove outbound links from images. I do not want to remove normal links from the document, only the ones that turn an image into a link, while keeping the image intact. Example:
<a href="http://www.w3schools.com"><img src="logo_w3s.gif"></a>
Should be replaced and become:
<img src="logo_w3s.gif">
Here's my code that doesn't work but I feel is getting close:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(maintext);
var allimages = doc.DocumentNode.Descendants("img").ToList();
if (scrapeimages.Checked) {
    // The user does want images scraped. Remove image outbound links.
    try {
        foreach (HtmlNode n in allimages) {
            if (n.ParentNode.Name == "a") {
                var outer = n.OuterHtml;
                var newnode = HtmlNode.CreateNode(outer);
                n.ParentNode.ReplaceChild(n.ParentNode, newnode);
            }
        }
        maintext = doc.DocumentNode.OuterHtml;
    } catch {
    }
}
Upvotes: 1
Views: 1627
Reputation: 18127
var node = doc.DocumentNode.SelectSingleNode(yourANode);
node.ParentNode.RemoveChild(node, true);
Something like this should help. It removes the <a> node from its parent but keeps the <a>'s own children (the parent's grandchildren). The true argument to RemoveChild is the keepGrandChildren flag.
If every <img> is wrapped in an <a>:
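// "//img" matches images anywhere in the document; a bare "img" only matches direct children.
var nodeList = doc.DocumentNode.SelectNodes("//img");
foreach (HtmlNode node in nodeList)
{
    // node.ParentNode is the <a>; remove it from its own parent but keep the <img>.
    var parentATagNode = node.ParentNode.ParentNode;
    parentATagNode.RemoveChild(node.ParentNode, true);
}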
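If only some images are wrapped in links, the loop above would also remove whatever element happens to be the parent of an unlinked image. Below is a minimal sketch of a guarded version, assuming HtmlAgilityPack is referenced. The class name ImageLinkStripper and the method UnwrapLinkedImages are hypothetical names chosen for illustration, not part of the library.

```csharp
using System.Linq;
using HtmlAgilityPack;

public static class ImageLinkStripper
{
    // Hypothetical helper: unwraps every <img> whose direct parent is an <a>,
    // leaving ordinary text links and unlinked images untouched.
    public static string UnwrapLinkedImages(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Snapshot the matches first so the tree is not modified while it is enumerated.
        var linkedImages = doc.DocumentNode
            .Descendants("img")
            .Where(img => img.ParentNode != null && img.ParentNode.Name == "a")
            .ToList();

        foreach (HtmlNode img in linkedImages)
        {
            HtmlNode anchor = img.ParentNode;

            // Re-check here: an earlier iteration may already have removed this
            // anchor (for example when one <a> wraps several images).
            if (anchor == null || anchor.Name != "a" || anchor.ParentNode == null)
            {
                continue;
            }

            // keepGrandChildren = true: drop the <a> but keep its children in place.
            anchor.ParentNode.RemoveChild(anchor, true);
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

With the variables from the question, this could be called as maintext = ImageLinkStripper.UnwrapLinkedImages(maintext); inside the scrapeimages.Checked branch. Note that an <a> containing both an image and text would lose the link around the text as well, since the whole anchor is removed.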
Upvotes: 3