MattHodson

Reputation: 796

How to replace a value with HTMLAgilityPack

I'm trying to find the H2 elements in an HTML string and add an id attribute to each one.

    var doc = new HtmlDocument();
    doc.LoadHtml(blogsContent);
    foreach (var node in doc.DocumentNode.SelectNodes("//h2"))
    {
        var testing = node.OuterHtml.Replace("<h2>", "<h2 id=\"" + node.InnerText + "\">");
        // This does the job and changes the <h2> to a <h2 id="..."
    }
    var html = doc.DocumentNode.OuterHtml;
    // However, here, the whole document after the foreach does not include any of the replacements.

How do I make html include all the changes that the foreach loop should be applying?

I've looked over StackOverflow, and can't really find an identical question which solves my issue. I apologise if I'm being daft.

Upvotes: 1

Views: 725

Answers (1)

Steve B

Reputation: 37720

Please try this:

    var doc = new HtmlDocument();
    doc.LoadHtml(@"
    <div>
        <h2>val 1</h2>
        <h2>val 2</h2>
        <h2>val 3</h2>
    </div>  
    ");
    foreach (var node in doc.DocumentNode.SelectNodes("//h2"))
    {
        // Set the id on the node itself, so the change is part of the document
        node.SetAttributeValue("id", node.InnerText);
    }
    var html = doc.DocumentNode.OuterHtml; 
    Console.WriteLine(html);

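As I said in my comment, the purpose (and benefit) of HTMLAgilityPack is to avoid string manipulation. In your code, node.OuterHtml.Replace(...) returns a new string that is only stored in testing; it never modifies the node, so the document is unchanged after the loop.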

Imagine one of your h2 elements contains forbidden characters, like Why "><script>alert("bang")</script></h2> <h2> should be escaped ? (This is especially relevant when allowing external user input, such as comments on a blog.)

Building the tag by string concatenation would lead to:

    <h2 id="Why "><script>alert("bang")</script> <h2> should be escaped ?">val 1</h2>

which is obviously a dangerous flaw (it leads to XSS attacks).
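
For reuse, the same approach could be wrapped in a small helper. This is only a sketch: the class name HeadingIds and the method AddIds are made up for illustration, and the Trim() call is my own addition. It also guards against one thing worth knowing: SelectNodes returns null, not an empty collection, when no node matches, so the foreach above would throw on a document without any h2.

```csharp
using System;
using HtmlAgilityPack;

public static class HeadingIds
{
    // Hypothetical helper (not part of HtmlAgilityPack): adds an id attribute
    // to every <h2> and returns the serialized document.
    public static string AddIds(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches the XPath expression.
        var headings = doc.DocumentNode.SelectNodes("//h2");
        if (headings == null)
        {
            return html; // nothing to change
        }

        foreach (var node in headings)
        {
            // Assumption: surrounding whitespace is not wanted in the id.
            node.SetAttributeValue("id", node.InnerText.Trim());
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

You would then call it as var html = HeadingIds.AddIds(blogsContent);. Note that heading text such as "val 1" contains a space, which is not valid in an HTML id, so in practice you may want to turn the text into a slug first.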

Upvotes: 3
