jaysonragasa

Reputation: 1076

How do I insert an HTML tag outside the selected tag?

I'm currently stuck figuring out how to insert an HTML tag outside the selected tag.

What I'm loading into HtmlDocument is a text file that contains "some" HTML tags. It's not a complete HTML document with the usual document-level tags, just a text file with "some" HTML tags in it.

Here's a sample of the text file's content:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas vel risus id velit iaculis elementum egestas vel purus. Vestibulum ante ipsum primis in <a href="http://www.domain.com">this domain</a> faucibus orci luctus et ultrices posuere cubilia Curae; In lorem enim, dignissim id congue at, malesuada vitae sem. Phasellus et nibh venenatis, vulputate elit ut, consectetur tellus. 

Sed placerat ex et dolor lobortis convallis. Nulla tincidunt elementum elementum. Integer lacinia elementum orci, ac pretium lacus hendrerit eu. Donec vitae lorem leo. Curabitur placerat sagittis nisi eu posuere. Vestibulum eget felis nisi. Nunc vitae enim iaculis, <a href="http://www.domain.com">this domain</a> maximus justo ullamcorper, imperdiet felis. Vestibulum vestibulum sapien id diam dapibus pharetra. Pellentesque varius purus justo, a vehicula lectus semper at.

There are two A tags there, and my XPath is simply "//a". My goal is to wrap the A tag in B, U, or I. The output should look like this:

<b><u><a href="http://www.domain.com">this domain</a></u></b>

I was hoping HtmlNode.InsertBefore would help, but this is what I got:

<a href="http://www.domain.com">this domain<b></b><u></u></a>

With HtmlNode.InsertAfter, it looks like this:

<a href="http://www.domain.com"><b></b><u></u>this domain</a>

Both InsertBefore and InsertAfter require a reference node. There is no reference node, so I just passed null.

Here's the sample code:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(this.document);

HtmlNodeCollection nodcoll = doc.DocumentNode.SelectNodes("//a");
if (nodcoll != null)
{
    foreach (HtmlNode nod in nodcoll)
    {
        // nod.InsertAfter(HtmlNode.CreateNode(newhtml), null);
        // nod.InsertBefore(HtmlNode.CreateNode(newhtml), null);
    }           
}

Update: I forgot to mention that putting the formatting around the A tag is the SEO team's preference. If the formatting went inside the A tag, this would be much easier.

Upvotes: 0

Views: 163

Answers (1)

Prix

Reputation: 19528

Option A

One way is to rewrite the InnerHtml, which puts the formatting inside the link:

var nodeList = doc.DocumentNode.SelectNodes("//a");
if (nodeList != null && nodeList.Count > 0)
{
    foreach (var node in nodeList)
    {
        // Wrap the link's content; close the tags in reverse order of opening
        node.InnerHtml = "<b><u>" + node.InnerHtml + "</u></b>";
        // can also be written as:
        // node.InnerHtml = node.InnerHtml.Replace(node.InnerHtml, "<b><u>" + node.InnerHtml + "</u></b>");
    }
}

The above would produce:

<a href="http://www.domain.com"><b><u>this domain</u></b></a>

Option B

Another way is to replace the node itself, which puts the formatting around the link:

var nodeList = doc.DocumentNode.SelectNodes("//a");
if (nodeList != null && nodeList.Count > 0)
{
    foreach (var node in nodeList)
    {
        // Use OuterHtml, not InnerHtml, or you lose the link element itself
        var newNodeStr = "<b><u>" + node.OuterHtml + "</u></b>";
        // Replace the old node with our newly created one
        var newNode = HtmlNode.CreateNode(newNodeStr);
        node.ParentNode.ReplaceChild(newNode, node);
    }
}

The above would produce:

<b><u><a href="http://www.domain.com">this domain</a></u></b>
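
If you'd rather not build an HTML string, the same wrapping can probably be done with node operations: create the wrapper elements, swap the outer one in where the link was, then move the link inside. Treat this as a sketch rather than a tested solution. The sample input string is made up for illustration, and it assumes the HtmlAgilityPack package is referenced and that your C# version supports top-level statements.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical input: a text fragment with one link, like the question's sample
var doc = new HtmlDocument();
doc.LoadHtml("Lorem ipsum <a href=\"http://www.domain.com\">this domain</a> dolor sit amet.");

var anchors = doc.DocumentNode.SelectNodes("//a");
if (anchors != null)
{
    foreach (var anchor in anchors)
    {
        // Build the empty wrapper: <b><u></u></b>
        var bold = doc.CreateElement("b");
        var underline = doc.CreateElement("u");
        bold.AppendChild(underline);

        // Put the wrapper where the link was, which detaches the link...
        anchor.ParentNode.ReplaceChild(bold, anchor);

        // ...then move the link into the innermost wrapper element
        underline.AppendChild(anchor);
    }
}

// Print the resulting markup to check that the link now sits inside <b><u>
Console.WriteLine(doc.DocumentNode.OuterHtml);
```

This should end up with the same structure as Option B, but the link node is moved rather than re-parsed from a string, so any references you already hold to it stay valid.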

Upvotes: 1
