Reputation: 1684
I want to replace ## with ++ in an HTML document (but just in text nodes). I'm using HTML Agility Pack to manipulate the document. This is my code:
    private static void Main(string[] args)
    {
        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml("<html><p>This is a test paragraph ##</p><a>Not here ##</a><div><p>Nested paragraph ##</p></div></html>");
        Console.WriteLine(htmlDoc.Text);
        GenerateLinksInHtmlNode(htmlDoc.DocumentNode.ChildNodes);
        Console.WriteLine(htmlDoc.Text);
        Console.ReadKey();
    }

    private static void GenerateLinksInHtmlNode(HtmlNodeCollection htmlNodeColl)
    {
        foreach (var childNode in htmlNodeColl)
        {
            switch (childNode.NodeType)
            {
                case HtmlNodeType.Document:
                case HtmlNodeType.Element:
                    GenerateLinksInHtmlNode(childNode.ChildNodes);
                    break;
                case HtmlNodeType.Text when childNode.ParentNode.Name == "a":
                    continue;
                case HtmlNodeType.Text:
                {
                    var txtNode = (HtmlTextNode) childNode;
                    txtNode.Text = GenerateLinks(txtNode.Text);
                    break;
                }
            }
        }
    }

    private static string GenerateLinks(string txt)
    {
        return Regex.Replace(txt, "##", "++");
    }
When I debug it, I can see that each text node's text is replaced where it should be. But the second Console.WriteLine() call prints the same text as the first one.
Upvotes: 2
Views: 652
Reputation: 14231
The Text property of HtmlDocument is set when the document is loaded. After that, it does not change, even if you modify the nodes (see source). Use the InnerHtml or OuterHtml property of DocumentNode to see the changes:
Console.WriteLine(htmlDoc.DocumentNode.InnerHtml);
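For completeness, here is a minimal sketch of the whole round trip, assuming the HtmlAgilityPack NuGet package is referenced. The class name TextNodeReplacer and the method ReplaceOutsideLinks are hypothetical names chosen for illustration, and the flat Descendants() loop is just one alternative to the recursive walk in the question, not the only way to do it.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of HTML Agility Pack.
public static class TextNodeReplacer
{
    public static string ReplaceOutsideLinks(string html)
    {
        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html);

        // Take a snapshot of the text nodes before modifying them.
        var textNodes = htmlDoc.DocumentNode
            .Descendants()
            .OfType<HtmlTextNode>()
            .ToList();

        foreach (var textNode in textNodes)
        {
            // Skip text that sits directly inside an <a> element, as in the question.
            if (textNode.ParentNode.Name == "a")
                continue;

            textNode.Text = textNode.Text.Replace("##", "++");
        }

        // htmlDoc.Text would still hold the markup as it was loaded;
        // OuterHtml is serialized from the current node tree.
        return htmlDoc.DocumentNode.OuterHtml;
    }
}

public static class Program
{
    public static void Main()
    {
        const string html =
            "<html><p>This is a test paragraph ##</p><a>Not here ##</a>" +
            "<div><p>Nested paragraph ##</p></div></html>";

        Console.WriteLine(TextNodeReplacer.ReplaceOutsideLinks(html));
    }
}
```

If you need the modified markup written to a file or stream rather than returned as a string, htmlDoc.Save(...) should also serialize the current tree rather than the originally loaded text.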
Upvotes: 2