Reputation: 131
I have the following function that accepts an HTML string, for example "<p>áêö</p>":
    public string EncodeString(string input)
    {
        // ...
        return System.Net.WebUtility.HtmlEncode(input);
    }
I'd like to modify that function to output the same string, but with the accented characters as HTML entities. Calling System.Net.WebUtility.HtmlEncode() encodes the entire string, including the HTML tags. I'd like to preserve the HTML tags if possible, since the string is parsed and rendered elsewhere in the application. Is this something that is better solved with a regex?
Upvotes: 0
Views: 891
Reputation: 621
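You can use a library like AngleSharp to replace the content of an HTML element:
    using System.Threading.Tasks;
    using AngleSharp;
    using AngleSharp.Dom;

    public static async Task<string> EncodeString(string input)
    {
        var context = BrowsingContext.New(Configuration.Default);
        var document = await context.OpenAsync(req => req.Content(input));
        var pElement = document.QuerySelector("p");
        // Encode only the text inside the <p> element; the tag itself is left alone.
        // Check the output: the serializer may escape the '&' of each entity again.
        pElement.TextContent = System.Net.WebUtility.HtmlEncode(pElement.TextContent);
        return pElement.ToHtml();
    }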
See it in action here: .NET Fiddle
For more general situations where you have nested elements, here's the adapted code:
    public static async Task<string> EncodeString(string input)
    {
        var context = BrowsingContext.New(Configuration.Default);
        var document = await context.OpenAsync(req => req.Content(input));
        // Only the first top-level node of the parsed body is processed.
        var root = document.Body.FirstChild;
        EncodeTextNodes(root);
        return root.ToHtml();
    }

    // Walks the tree and encodes text nodes only; elements are just recursed into.
    private static void EncodeTextNodes(INode content)
    {
        foreach (var node in content.ChildNodes)
        {
            if (node.NodeType == NodeType.Text)
            {
                node.NodeValue = System.Net.WebUtility.HtmlEncode(node.NodeValue);
            }
            else
            {
                EncodeTextNodes(node);
            }
        }
    }
Upvotes: 1
Reputation: 437
This is quite possibly the oddest solution, but...
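    public static string EncodeString(string input)
    {
        // Assumes the input is a single wrapping element, e.g. "<p>...</p>".
        int startTagEnd = input.IndexOf('>') + 1;
        // LastIndexOf finds the outer closing tag even if the opening tag has attributes.
        int endTagStart = input.LastIndexOf("</", System.StringComparison.Ordinal);
        string startTag = input.Substring(0, startTagEnd);
        string endTag = input.Substring(endTagStart);
        string content = input.Substring(startTagEnd, endTagStart - startTagEnd);
        return startTag + System.Net.WebUtility.HtmlEncode(content) + endTag;
    }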
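If the goal is only to turn the accented characters into entities, another option is to skip tag handling altogether. Markup characters are plain ASCII, so replacing just the non-ASCII characters with numeric character references leaves the tags as they are. Below is a minimal sketch of that idea; the class name HtmlEntityEncoder and the method EncodeNonAscii are made up for illustration and are not part of any library. It does not escape "<", ">" or "&" in the text content, which may or may not be what you want.

```csharp
using System;
using System.Text;

public static class HtmlEntityEncoder
{
    // Hypothetical helper: replaces every non-ASCII character with a numeric
    // character reference (&#NNN;) and copies ASCII, including tags, unchanged.
    public static string EncodeNonAscii(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }

        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c <= 127)
            {
                sb.Append(c);
                continue;
            }

            int codePoint = c;
            // Combine a surrogate pair into one code point so characters
            // outside the Basic Multilingual Plane become a single reference.
            if (char.IsHighSurrogate(c) && i + 1 < input.Length
                && char.IsLowSurrogate(input[i + 1]))
            {
                codePoint = char.ConvertToUtf32(c, input[i + 1]);
                i++;
            }

            sb.Append("&#").Append(codePoint).Append(';');
        }

        return sb.ToString();
    }
}
```

With this sketch, HtmlEntityEncoder.EncodeNonAscii("<p>áêö</p>") returns "<p>&#225;&#234;&#246;</p>".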
Upvotes: 1