user2729272

Reputation: 359

How to replace the <br> tag with <br/> using HtmlAgilityPack?

Quite strange! When I load the document and replace a tag with an empty string using

var document = new HtmlDocument();
document.LoadHtml(data);
document.DocumentNode.OuterHtml.Replace("<tbody>", "");

this works fine and <tbody> is removed.

When I try to replace <br> with <br/> in the same way, using

document.DocumentNode.OuterHtml.Replace("<br>", "<br/>");

It does not work :(

I also tried:

var brTags = document.DocumentNode.SelectNodes("//br");
if (brTags != null)
{
    foreach (HtmlNode brTag in brTags)
    {
        brTag.OuterHtml = "<br/>";
        // brTag.Name = "br/"; -> I also tried this one :(
    }
}

Does HtmlAgilityPack's Replace() not work for self-closing tags?

Upvotes: 9

Views: 4598

Answers (4)

ragmn

Reputation: 505

document.OptionWriteEmptyNodes = true;

Will do the trick for you!

Upvotes: 12

Simon Mourier

Reputation: 138915

You don't have to replace <br> with <br/> manually. If you need the node to be closed, just instruct the library to do so. For example, this:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml("<br/>");
doc.Save(Console.Out);

will output this:

<br>

and this

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml("<br/>");
doc.OptionWriteEmptyNodes = true;
doc.Save(Console.Out);

will output this:

<br />

Upvotes: 8

Sergey Berezovskiy

Reputation: 236218

Actually, your first snippet should not work either unless you assign the result of the replacement back. Strings are immutable in C#: Replace creates and returns a new string, and the original string stays unchanged.
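To illustrate the immutability point, here is a minimal sketch that uses plain strings rather than HtmlAgilityPack. The class and method names (ReplaceDemo, ReplaceAndDiscard, ReplaceAndAssign) are made up for this example and are not part of any library.

```csharp
using System;

static class ReplaceDemo
{
    // Calls Replace but throws the result away; the input comes back unchanged.
    public static string ReplaceAndDiscard(string html)
    {
        html.Replace("<br>", "<br/>");
        return html;
    }

    // Assigns the result of Replace back, so the change is kept.
    public static string ReplaceAndAssign(string html)
    {
        html = html.Replace("<br>", "<br/>");
        return html;
    }

    static void Main()
    {
        Console.WriteLine(ReplaceAndDiscard("one<br>two")); // one<br>two
        Console.WriteLine(ReplaceAndAssign("one<br>two"));  // one<br/>two
    }
}
```

The same applies to the string returned by DocumentNode.OuterHtml: calling Replace on it and ignoring the return value leaves both the string and the document untouched.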

Also, OuterHtml is read-only, so you cannot assign to it.

To remove nodes, select them, remove each one, and save the result back to the original string:

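var document = new HtmlDocument();
document.LoadHtml(data);
// SelectNodes can return null when nothing matches, so guard before iterating.
var tbodies = document.DocumentNode.SelectNodes("//tbody");
if (tbodies != null)
    foreach (var tbody in tbodies)
        tbody.Remove();
data = document.DocumentNode.OuterHtml;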

UPDATE:

// SelectNodes can return null when nothing matches.
var brs = document.DocumentNode.SelectNodes("//br");
if (brs != null)
    foreach (var br in brs)
        br.RemoveAllChildren();

HtmlNode.ElementsFlags["br"] = HtmlElementFlag.Closed | HtmlElementFlag.Empty;
document.OptionWriteEmptyNodes = true; // write empty elements as <br />
data = document.DocumentNode.OuterHtml;

Upvotes: 1

EZI

Reputation: 15364

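// Requires: using System.IO; using System.Xml;
var writer = new StringWriter();
var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
document.OptionOutputAsXml = true; // have the document emit XML-style output

// Dispose the XmlWriter so everything is flushed before reading the string.
using (var xmlWriter = XmlWriter.Create(writer, settings))
{
    document.Save(xmlWriter);
}
var newHtml = writer.ToString();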

Upvotes: 2
