Reputation: 796
I'm trying to find the H2s in a string and replace them to add an id to each H2.
var doc = new HtmlDocument();
doc.LoadHtml(blogsContent);
foreach (var node in doc.DocumentNode.SelectNodes("//h2"))
{
    var testing = node.OuterHtml.Replace("<h2>", "<h2 id=\"" + node.InnerText + "\">");
    // This does the job and changes the <h2> to a <h2 id="...">
}
var html = doc.DocumentNode.OuterHtml;
// However, here, the whole document after the foreach does not include any of the replacements.
How do I make var html contain all the changes that the foreach should be making?
I've looked over StackOverflow, and can't really find an identical question which solves my issue. I apologise if I'm being daft.
Upvotes: 1
Views: 725
Reputation: 37720
Please try this:
var doc = new HtmlDocument();
doc.LoadHtml(@"
<div>
<h2>val 1</h2>
<h2>val 2</h2>
<h2>val 3</h2>
</div>
");
foreach (var node in doc.DocumentNode.SelectNodes("//h2"))
{
    // Set the id on the node itself, so the change becomes part of the document
    node.SetAttributeValue("id", node.InnerText);
}
var html = doc.DocumentNode.OuterHtml;
Console.WriteLine(html);
As I said in my comment, the purpose (and benefit) of HtmlAgilityPack is to avoid string manipulation.
Imagine one of your h2 elements contains forbidden characters, like: Why "><script>alert("bang")</script> <h2> should be escaped ?
(This is especially relevant when you allow external user input, such as comments on a blog.)
With string replacement, this would lead to:
<h2 id="Why "><script>alert("bang")</script> <h2> should be escaped ?">val 1</h2>
which is obviously a dangerous flaw: it opens the door to XSS attacks.
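To make the risk concrete, here is a minimal sketch that compares naive concatenation with an encoded attribute value. It does not use HtmlAgilityPack at all and makes no claim about what the library does internally; it only uses WebUtility.HtmlEncode from the .NET base library. The headingText value is a hypothetical input made up for illustration.

```csharp
using System;
using System.Net;

class Program
{
    static void Main()
    {
        // Hypothetical heading text, as it might arrive from user input.
        string headingText = "Why \"><script>alert(\"bang\")</script> should be escaped ?";

        // Naive concatenation: the first quote in the text closes the id
        // attribute early, so the <script> element ends up in the markup.
        string unsafeTag = "<h2 id=\"" + headingText + "\">";

        // Encoding the value first turns quotes and angle brackets into
        // entities, so the whole text stays inside the attribute value.
        string safeTag = "<h2 id=\"" + WebUtility.HtmlEncode(headingText) + "\">";

        Console.WriteLine(unsafeTag);
        Console.WriteLine(safeTag);
    }
}
```

Printing both strings side by side shows why building attributes by hand is fragile: every place that concatenates markup has to remember to encode, whereas setting the attribute through the parsed document keeps that concern in one place.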
Upvotes: 3