PRosa

Reputation: 1

HtmlAgilityPack remove node (thead) not working

I want to remove a complete thead (including its th elements). Why doesn't this work? I've tried other tags as well and nothing happens: the resulting text is the same, as if no changes had been made.

<table>
  <thead>
    <tr>
      <th>Hora</th>
      <th>Estado</th>
      <th>Motivo</th>
      <th>Local</th>
      <th>Recetor</th>
    </tr>
  </thead>
</table>

C# code:

var doc = new HtmlDocument();
doc.LoadHtml("<table><thead><th>Hora</th><th>Estado</th><th>Motivo</th><th>Local</th><th>Recetor</th></thead></table>");

var nodes = doc.DocumentNode.SelectNodes("//thead").ToList();

foreach (var node in nodes)
{
    node.Remove();
}

txtResults.Text = doc.Text;

Upvotes: 0

Views: 225

Answers (1)

Sergey Berezovskiy

Reputation: 236228

The HtmlDocument.Text property has a very unclear description:

The HtmlDocument Text. Careful if you modify it.

From the observed behavior, it looks like this property is not updated when you modify the HTML document. Use doc.DocumentNode.OuterHtml instead.
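
As a minimal sketch of that suggestion, assuming the HtmlAgilityPack package is referenced: the helper below removes the matching nodes and returns doc.DocumentNode.OuterHtml rather than doc.Text. The class name TheadRemover and the method RemoveNodes are hypothetical names used only for illustration; they are not part of the library.

```csharp
using System.Linq;
using HtmlAgilityPack;

public static class TheadRemover
{
    // Hypothetical helper (not part of HtmlAgilityPack): removes every node
    // matching the XPath expression and returns the modified markup.
    public static string RemoveNodes(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches, so guard against it.
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes != null)
        {
            // Iterate over a copy so removing nodes does not disturb the enumeration.
            foreach (var node in nodes.ToList())
            {
                node.Remove();
            }
        }

        // OuterHtml is serialized from the node tree, so it reflects the removal;
        // doc.Text would still hold the text that was originally loaded.
        return doc.DocumentNode.OuterHtml;
    }
}
```

With the question's markup this could be called as txtResults.Text = TheadRemover.RemoveNodes(html, "//thead"); and the returned string should no longer contain the thead element.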


Update: From the ParsedText property implementation, it looks like Text is supposed to hold the original, unmodified parsed text:

public string ParsedText
{
    get { return Text; }
}

But Text is not even a read-only property: it's a public field that can be modified at any time by anyone. So, as its description warns, I would not rely on HtmlDocument.Text.

Upvotes: 1
