Reputation: 41
How can I remove all unknown custom tags while keeping the HTML content inside them, as in the following example:
<div>
<h1>my header</h1>
<custom:p>
<h2>my Title</h2>
</custom:p>
<anothercustom:p>
<h3>my SubTitle</h3>
</anothercustom:p>
</div>
I would like to get back:
<div>
<h1>my header</h1>
<h2>my Title</h2>
<h3>my SubTitle</h3>
</div>
Is there a way to do this with HtmlSanitizer?
Thanks for your help.
Upvotes: 4
Views: 5752
Reputation: 2841
I've been looking for the same thing. I found that HtmlSanitizer has a KeepChildNodes option in version 3.4.156, which I'm using, that does exactly this.
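var sanitizer = new HtmlSanitizer();
// Keep the children of any removed tag instead of dropping them with it.
sanitizer.KeepChildNodes = true;
// Sanitize returns the cleaned markup; the input string is not modified.
var sanitized = sanitizer.Sanitize(html);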
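For a fuller picture, here is a minimal sketch of the KeepChildNodes approach applied to the markup from the question. It assumes the HtmlSanitizer NuGet package is referenced and exposes the Ganss.XSS namespace (newer releases use Ganss.Xss instead), and that the default allow-list already permits div and h1 to h3. The class name Demo is only a placeholder.

```csharp
using System;
using Ganss.XSS; // assumption: older package versions; newer ones use Ganss.Xss

public static class Demo
{
    public static void Main()
    {
        // Markup from the question: two unknown, prefixed wrapper tags.
        const string html = @"<div>
<h1>my header</h1>
<custom:p>
<h2>my Title</h2>
</custom:p>
<anothercustom:p>
<h3>my SubTitle</h3>
</anothercustom:p>
</div>";

        var sanitizer = new HtmlSanitizer();

        // Disallowed tags are removed, but their child nodes are kept in place.
        sanitizer.KeepChildNodes = true;

        string sanitized = sanitizer.Sanitize(html);
        Console.WriteLine(sanitized);
    }
}
```

The whitespace around the removed wrappers may differ from the hand-written target in the question, so compare the structure rather than the exact string.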
Upvotes: 4
Reputation: 1488
You can use the HtmlSanitizer.RemovingTag event to keep the contents of the tag:
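var sanitizer = new HtmlSanitizer();
sanitizer.RemovingTag += (sender, args) =>
{
    // Replace the tag with its own sanitized contents.
    args.Tag.OuterHtml = sanitizer.Sanitize(args.Tag.InnerHtml);
    // Cancel the default removal, since the tag has already been replaced.
    args.Cancel = true;
};
var sanitized = sanitizer.Sanitize("<unknown>this will not be removed</unknown>");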
Upvotes: 2
Reputation: 153
Assuming you are using HtmlSanitizer for .NET from GitHub (https://github.com/mganss/HtmlSanitizer), you can modify the open-source project to preserve the content of the tags. Change the RemoveTag method of the HtmlSanitizer class as follows:
/// <summary>
/// Remove a tag from the document.
/// </summary>
/// <param name="tag">to be removed</param>
private void RemoveTag(IDomObject tag)
{
    var e = new RemovingTagEventArgs { Tag = tag };
    OnRemovingTag(e);
    if (!e.Cancel)
    {
        // Original line, to be removed:
        // tag.Remove();
        // Replacement: swap the tag for its sanitized inner content.
        tag.OuterHTML = this.Sanitize(tag.InnerHTML);
    }
}
Upvotes: 0