ragmn

Reputation: 505

How to delete an HTML tag alone, but not its inner HTML or child tags, using HtmlAgilityPack?

I need to delete an HTML tag, say <tbody>, from the following markup while keeping everything inside it:

<TABLE>
  <TBODY>
  <TR>    
    <TD></TD>
    <TD></TD>
    <TD></TD></TR>
  <TR>    
    <TD valign="bottom"></TD>
    <TD valign="bottom"></TD>
    <TD valign="bottom"></TD></TR>
  </TBODY>
</TABLE>

I'm using,

      var document = new HtmlDocument();
      document.LoadHtml(<URL>);
      var tbody = document.DocumentNode.SelectSingleNode("//tbody");
      if (tbody != null)
      {
          tbody.Remove();
      }

But it's deleting the entire block, children included, instead of just the <tbody> tag alone :(

Appreciate your help & time :)

Upvotes: 1

Views: 946

Answers (3)

It Grunt

Reputation: 3388

If you give your tags an id, you should be able to access the element by its id. This will make it easy to find the tag you want to delete.

Upvotes: 0

I4V

Reputation: 35373

var tbody = document.DocumentNode.SelectSingleNode("//tbody");
tbody.ParentNode.RemoveChild(tbody, keepGrandChildren: true);

OUTPUT:

<table>

  <tr>    
    <td valign="bottom"></td>
    <td valign="bottom"></td>
    <td valign="bottom"></td></tr>
  <tr>    
    <td></td>
    <td></td>
    <td></td></tr>

</table>
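If the page may contain several <tbody> tags, or none at all, the same call can be wrapped in a small helper. This is only a minimal sketch: `TagUnwrapper` and `Unwrap` are hypothetical names made up for illustration, and it assumes the HtmlAgilityPack NuGet package is referenced. The order in which the kept children come back may depend on the HtmlAgilityPack version, so it is worth checking the row order in the result.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of HtmlAgilityPack.
public static class TagUnwrapper
{
    // Removes every <tagName> element from the HTML but keeps its children.
    public static string Unwrap(string html, string tagName)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var matches = document.DocumentNode.SelectNodes("//" + tagName);
        if (matches == null)
        {
            return document.DocumentNode.OuterHtml;
        }

        // Copy to a list first so the tree can be modified while iterating.
        foreach (var node in matches.ToList())
        {
            // keepGrandChildren: true re-attaches the node's children to its parent.
            node.ParentNode.RemoveChild(node, true);
        }

        return document.DocumentNode.OuterHtml;
    }
}
```

Usage would look something like `var cleaned = TagUnwrapper.Unwrap(html, "tbody");`, where `html` is the markup as a string.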

Upvotes: 4

xyzzy.rad

Reputation: 31

The inner HTML is an integral part of the tag, which is why removing the tag also removes its inner HTML.

What you need to do is replace the <tbody> tag with the inner HTML of <tbody>. In your case, something like this (I did not check whether this code works, but you get the idea):

document.DocumentNode.SelectSingleNode("//table").InnerHtml = document.DocumentNode.SelectSingleNode("//tbody").InnerHtml;

Upvotes: 1
