Reputation: 1076
I'm currently stuck trying to figure out how to insert an HTML tag outside the selected tag.
What I'm loading into HtmlDocument is a text file that contains "some" HTML tags. It's not a complete HTML document with the usual document-level tags, just a text file with "some" HTML tags.
Here's a sample of the text file's content:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas vel risus id velit iaculis elementum egestas vel purus. Vestibulum ante ipsum primis in <a href="http://www.domain.com">this domain</a> faucibus orci luctus et ultrices posuere cubilia Curae; In lorem enim, dignissim id congue at, malesuada vitae sem. Phasellus et nibh venenatis, vulputate elit ut, consectetur tellus.
Sed placerat ex et dolor lobortis convallis. Nulla tincidunt elementum elementum. Integer lacinia elementum orci, ac pretium lacus hendrerit eu. Donec vitae lorem leo. Curabitur placerat sagittis nisi eu posuere. Vestibulum eget felis nisi. Nunc vitae enim iaculis, <a href="http://www.domain.com">this domain</a> maximus justo ullamcorper, imperdiet felis. Vestibulum vestibulum sapien id diam dapibus pharetra. Pellentesque varius purus justo, a vehicula lectus semper at.
There are two A tags there, and my XPath is simply "//a". My goal is to wrap the A tag in B, U, or I. The output should look like this:
<b><u><a href="http://www.domain.com">this domain</a></u></b>
I was hoping HtmlNode.InsertBefore would help, but this is what happened:
<a href="http://www.domain.com">this domain<b></b><u></u></a>
If HtmlNode.InsertAfter is used, it looks like this:
<a href="http://www.domain.com"><b></b><u></u>this domain</a>
Both InsertBefore and InsertAfter require a reference node. There is no reference node, so I just passed null.
Here's the sample code:
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(this.document);
HtmlNodeCollection nodcoll = doc.DocumentNode.SelectNodes("//a");
if (nodcoll != null)
{
    foreach (HtmlNode nod in nodcoll)
    {
        // nod.InsertAfter(HtmlNode.CreateNode(newhtml), null);
        // nod.InsertBefore(HtmlNode.CreateNode(newhtml), null);
    }
}
Update: I forgot to mention that this is the SEO's preferred way of formatting the A tag. If the formatting were inside the A tag, it would be so much easier.
Upvotes: 0
Views: 163
Reputation: 19528
One way to achieve this would be to replace the InnerHtml, like this:
var nodeList = doc.DocumentNode.SelectNodes("//a");
if (nodeList != null && nodeList.Count > 0)
{
    foreach (var node in nodeList)
    {
        // Wrap the link's content; close the tags in reverse order of opening
        node.InnerHtml = "<b><u>" + node.InnerHtml + "</u></b>";
        // can also be written as:
        // node.InnerHtml = node.InnerHtml.Replace(node.InnerHtml, "<b><u>" + node.InnerHtml + "</u></b>");
    }
}
The above would produce:
<a href="http://www.domain.com"><b><u>this domain</u></b></a>
Another way, replacing the node itself, would be:
var nodeList = doc.DocumentNode.SelectNodes("//a");
if (nodeList != null && nodeList.Count > 0)
{
    foreach (var node in nodeList)
    {
        // Use OuterHtml here, or you lose the link element itself
        var newNodeStr = "<b><u>" + node.OuterHtml + "</u></b>";
        // Replace the old node with our newly created one
        var newNode = HtmlNode.CreateNode(newNodeStr);
        node.ParentNode.ReplaceChild(newNode, node);
    }
}
The above would produce:
<b><u><a href="http://www.domain.com">this domain</a></u></b>
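If you would rather not build the markup by string concatenation, the same wrapping can probably be done with node operations alone. The following is only a sketch, assuming HtmlAgilityPack is referenced; LinkWrapper and WrapLinks are hypothetical names made up for illustration. It also shows why passing null as the reference node did not work in the question: the insert and replace methods are called on the parent, and the reference node is the existing child.

```csharp
using System;
using HtmlAgilityPack;

public static class LinkWrapper
{
    // Hypothetical helper: wraps every <a> element in <b><u>...</u></b>.
    public static string WrapLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = doc.DocumentNode.SelectNodes("//a");
        if (links == null)
        {
            return html; // SelectNodes returns null when nothing matches
        }

        foreach (var link in links)
        {
            // Build the empty wrapper: <b><u></u></b>
            var bold = doc.CreateElement("b");
            var underline = doc.CreateElement("u");
            bold.AppendChild(underline);

            // Put the wrapper where the link was, then move the link inside it.
            link.ParentNode.ReplaceChild(bold, link);
            underline.AppendChild(link);
        }

        return doc.DocumentNode.OuterHtml;
    }

    public static void Main()
    {
        Console.WriteLine(WrapLinks(
            "Vestibulum <a href=\"http://www.domain.com\">this domain</a> faucibus"));
    }
}
```

The order matters: the link is swapped out of its parent first and only then appended to the U element, so it never has two parents at once.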
Upvotes: 1