Reputation: 359
Quite strange! When I load the HTML and replace <tbody> with an empty string using
var document = new HtmlDocument();
document.LoadHtml(data);
document.DocumentNode.OuterHtml.Replace("<tbody>", "");
This works fine and <tbody> is removed.
But when I try to replace <br> with <br/> the same way, using
document.DocumentNode.OuterHtml.Replace("<br>", "<br/>");
It does not work :(
I also tried:
var brTags = document.DocumentNode.SelectNodes("//br");
if (brTags != null)
{
    foreach (HtmlNode brTag in brTags)
    {
        brTag.OuterHtml = "<br/>";
        // brTag.Name= "br/"; - > Also this one :(
    }
}
Does replacing not work for self-closing tags in HtmlAgilityPack?
Upvotes: 9
Views: 4598
Reputation: 138915
You don't have to replace <br> with <br/> manually. If you need to close the node, just instruct the library to do so. For example, this:
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml("<br/>");
doc.Save(Console.Out);
will output this:
<br>
and this
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml("<br/>");
doc.OptionWriteEmptyNodes = true;
doc.Save(Console.Out);
will output this:
<br />
Upvotes: 8
Reputation: 236218
Actually, your first snippet should not work either unless you assign the result of the replacement back to something. Strings are immutable in C#: when you call Replace, a new string is created and returned, and the original string stays unchanged.
Also, OuterHtml is read-only. You cannot assign to it.
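To illustrate the immutability point without involving HtmlAgilityPack at all, here is a minimal sketch using only a plain string. The class name ReplaceDemo and the helper StripTbody are hypothetical names made up for this example, and the sample markup is an assumption, not the asker's data.

```csharp
using System;

public static class ReplaceDemo
{
    // Hypothetical helper: returns a copy of html with every "<tbody>" removed.
    public static string StripTbody(string html)
    {
        return html.Replace("<tbody>", "");
    }

    public static void Main()
    {
        // Sample markup assumed for illustration only.
        string data = "<table><tbody><tr></tr></table>";

        // The return value is discarded, so data is left untouched.
        data.Replace("<tbody>", "");
        Console.WriteLine(data.Contains("<tbody>")); // True

        // Assigning the result back is what actually changes data.
        data = StripTbody(data);
        Console.WriteLine(data.Contains("<tbody>")); // False
    }
}
```

The same applies to document.DocumentNode.OuterHtml.Replace(...): the call returns a new string and leaves the document as it was.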
To remove nodes, select them, remove each one, and then save the result back to the original string.
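var document = new HtmlDocument();
document.LoadHtml(data);
// SelectNodes returns null when nothing matches, so check before iterating.
var tbodies = document.DocumentNode.SelectNodes("//tbody");
if (tbodies != null)
    foreach (var tbody in tbodies)
        tbody.Remove();
data = document.DocumentNode.OuterHtml;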
UPDATE:
foreach (var br in document.DocumentNode.SelectNodes("//br"))
br.RemoveAllChildren();
HtmlNode.ElementsFlags["br"] = HtmlElementFlag.Closed | HtmlElementFlag.Empty;
document.OptionWriteEmptyNodes = true;
data = document.DocumentNode.OuterHtml;
Upvotes: 1
Reputation: 15364
StringWriter writer = new StringWriter();
var xmlWriter = XmlWriter.Create(writer, new XmlWriterSettings() { OmitXmlDeclaration = true });
document.OptionOutputAsXml = true;
document.Save(xmlWriter);
var newHtml = writer.ToString();
Upvotes: 2