Zach Johnson

Reputation: 2237

Remove parent node but keep child node with HtmlAgilityPack?

OK, I'm stumped here: how can I remove a parent node and replace it with its child?

My goal is to remove outbound links from images. I do not want to remove normal links from the document, only the ones that turn an image into a link, while keeping the image intact. Example:

<a href="http://www.w3schools.com"><img src="logo_w3s.gif"></a>

Should be replaced and become:

<img src="logo_w3s.gif">

Here's my code that doesn't work but I feel is getting close:

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(maintext);
var allimages = doc.DocumentNode.Descendants("img").ToList();

if (scrapeimages.Checked) {
    //the user does want images scraped. Remove image outbound links
    try {
        foreach (HtmlNode n in allimages) {
            if (n.ParentNode.Name == "a") {
                string outer = n.OuterHtml;
                HtmlNode newnode = HtmlNode.CreateNode(outer);

                n.ParentNode.ReplaceChild(n.ParentNode, newnode);

            }
        }
        maintext = doc.DocumentNode.OuterHtml;
    } catch {
    }
}

Upvotes: 1

Views: 1627

Answers (1)

mybirthname

Reputation: 18127

var node = doc.DocumentNode.SelectSingleNode(yourANode);
node.ParentNode.RemoveChild(node, true);

Something like this should help. It removes your <a> node from its parent but keeps the <a>'s own children (the parent's grandchildren) in its place. The true argument to RemoveChild is the keepGrandChildren parameter.

If every <img> is wrapped in an <a>:

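// "//img" matches <img> elements at any depth; a bare "img" would only match direct children
var nodeList = doc.DocumentNode.SelectNodes("//img");

foreach (HtmlNode node in nodeList)
{
    // node.ParentNode is the <a>; its parent is the node the <img> should end up under
    var parentATagNode = node.ParentNode.ParentNode;
    parentATagNode.RemoveChild(node.ParentNode, true);
}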
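
If only some images are wrapped in links, a variation is to select the anchors themselves instead of the images, so normal text links are left alone. The following is a minimal sketch, not tested against your document: StripImageLinks and the sample HTML are hypothetical names made up for illustration, and the XPath //a[img] is an assumption about what you want to match (any <a> with an <img> as a direct child, including anchors that also contain text).

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    // Hypothetical helper: unwraps every <a> that has an <img> as a direct child.
    static string StripImageLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[img]");
        if (anchors != null)
        {
            foreach (HtmlNode anchor in anchors)
            {
                // keepGrandChildren = true: the anchor goes away,
                // its children are moved up into its place.
                anchor.ParentNode.RemoveChild(anchor, true);
            }
        }

        return doc.DocumentNode.OuterHtml;
    }

    static void Main()
    {
        // Sample input (an assumption): one image link and one normal link.
        string html =
            "<p><a href=\"http://www.w3schools.com\"><img src=\"logo_w3s.gif\"></a> " +
            "<a href=\"page.html\">a normal link</a></p>";

        Console.WriteLine(StripImageLinks(html));
    }
}
```

Selecting the anchors rather than the images also avoids calling RemoveChild twice on the same <a> when one link wraps several images. If you only want to unwrap anchors that contain nothing but the image, the XPath would need to be tightened.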

Upvotes: 3
