VirusX

Reputation: 973

How to substitute/add a root element in HtmlAgilityPack?

Assume I have the following HTML code:

<p>Hello, bla-bla-bla</p>
<a href="somesite">Click here</a>

As you can see, it has no html or body tags. I want to wrap the whole document in html and body tags.

I tried to add the html tag with the following code:

var el = doc.CreateElement("html");
var nodes = doc.DocumentNode.ChildNodes;
doc.DocumentNode.RemoveAllChildren();
el.AppendChildren(nodes);
doc.DocumentNode.AppendChild(el);

But after that, calling doc.DocumentNode.WriteContentTo() returns <html></html>. If I change the order of the last two lines:

var el = doc.CreateElement("html");
var nodes = doc.DocumentNode.ChildNodes;
doc.DocumentNode.RemoveAllChildren();
doc.DocumentNode.AppendChild(el); 
el.AppendChildren(nodes);  

I get a System.StackOverflowException when calling doc.DocumentNode.WriteContentTo().

A possible solution would be something like this:

doc.LoadHtml("<html>" + doc.DocumentNode.WriteContentTo() + "</html>");

but that would be inefficient, because the whole document gets serialized and parsed again.

Do you have any ideas on how to solve this in a performance-efficient way?

Upvotes: 2

Views: 1682

Answers (1)

VirusX

Reputation: 973

Finally, I got it to work. The first sample didn't work because doc.DocumentNode.ChildNodes returns the live HtmlNodeCollection itself, not a copy. As a result, RemoveAllChildren() emptied that collection before its nodes could be appended to el. The code below does the trick:

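var el = doc.CreateElement("html");
// This is the live collection, not a copy.
var nodes = doc.DocumentNode.ChildNodes;
// Append the nodes to the new element while the collection still holds them.
el.AppendChildren(nodes);
// Only now clear the document root and attach the new element.
doc.DocumentNode.RemoveAllChildren();
doc.DocumentNode.AppendChild(el);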
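
If you would rather not depend on the order of calls, another option is to take a snapshot of the child nodes before touching the root. The sketch below is only an illustration of that idea: RootWrapper and WrapRoot are hypothetical names of mine, not part of HtmlAgilityPack, and I'm assuming ToList() from System.Linq is available on HtmlNodeCollection (it implements IList<HtmlNode>). The same aliasing presumably explains the StackOverflowException from the second attempt in the question: once the root is cleared and el is appended to it, nodes contains el, so el.AppendChildren(nodes) makes el its own child.

```csharp
using System.Linq;
using HtmlAgilityPack;

public static class RootWrapper
{
    // Hypothetical helper (not part of HtmlAgilityPack): moves all top-level
    // nodes of the document under a newly created element named tagName.
    public static HtmlNode WrapRoot(HtmlDocument doc, string tagName)
    {
        // ToList() copies the node references, so clearing the live
        // ChildNodes collection below cannot empty this snapshot.
        var snapshot = doc.DocumentNode.ChildNodes.ToList();

        // Detach everything from the document root.
        doc.DocumentNode.RemoveAllChildren();

        // Re-attach the saved nodes under the new wrapper element.
        var wrapper = doc.CreateElement(tagName);
        foreach (var node in snapshot)
        {
            wrapper.AppendChild(node);
        }

        // The wrapper becomes the only child of the document root.
        doc.DocumentNode.AppendChild(wrapper);
        return wrapper;
    }
}
```

Since the question asks for both tags, calling RootWrapper.WrapRoot(doc, "body") and then RootWrapper.WrapRoot(doc, "html") should nest the original content inside body, and body inside html, without reparsing the document.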

Upvotes: 3
