Alexandre Rondeau

Reputation: 2687

XML Unicode Safe Encoding

I'm looking for a way to save an XML document so that non-ASCII characters are written as numeric character references such as &#233;.

Using this basic code

var xmlDoc = new XmlDocument();
xmlDoc.Load(@"D:\Temp\XmlDocBase.xml");
xmlDoc.Save(@"D:\Temp\XmlDocBaseCopy.xml");

my XML document goes from:

<?xml version="1.0"?>
<Tag1>
  <comment>entit&#233;</comment>
</Tag1>

to

<?xml version="1.0"?>
<Tag1>
  <comment>entité</comment>
</Tag1>

Regards

Upvotes: 4

Views: 8259

Answers (2)

Gilles

Reputation: 5407

You can call HttpUtility.HtmlEncode on a string.

return HttpUtility.HtmlEncode("entité");

Returns entit&#233;

HttpUtility is part of System.Web.

Upvotes: 2

Alexei Levenkov

Reputation: 100547

You can force an encoding that does not support all Unicode characters (e.g. ASCII). As a result, the writer is forced to emit character references for anything the encoding cannot represent.

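    XmlDocument doc = new XmlDocument();
    doc.LoadXml("<Tag1><comment>entit&#233;</comment></Tag1>");

    // ASCII cannot represent é, so the writer falls back to a character reference.
    var settings = new XmlWriterSettings { Encoding = System.Text.Encoding.ASCII };
    using (var writer = XmlWriter.Create(@"c:\temp\o.xml", settings))
    {
        doc.Save(writer); // disposing the writer flushes and closes the file
    }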

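Results in (note that the writer emits the hexadecimal form &#xE9;, which refers to the same character as the decimal &#233;):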

<?xml version="1.0" encoding="us-ascii"?><Tag1><comment>entit&#xE9;</comment></Tag1>
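To try the same idea without writing a file, here is a minimal sketch that saves the document to a MemoryStream and returns the result as a string. The class name Program and the helper ToAsciiXml are made up for this illustration. It assumes the same ASCII-encoding trick as above and has not been checked against every .NET version.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class Program
{
    // Hypothetical helper: serializes the document with an ASCII-only encoding,
    // so characters outside ASCII are written as character references.
    public static string ToAsciiXml(XmlDocument doc)
    {
        var settings = new XmlWriterSettings { Encoding = Encoding.ASCII };
        using (var stream = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(stream, settings))
            {
                doc.Save(writer);
            } // disposing the writer flushes everything into the stream

            // The bytes are pure ASCII, so decoding them as ASCII is lossless.
            return Encoding.ASCII.GetString(stream.ToArray());
        }
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<Tag1><comment>entit&#233;</comment></Tag1>");
        Console.WriteLine(ToAsciiXml(doc));
    }
}
```

The printed text should match the result shown above, including the encoding="us-ascii" declaration. If the decimal form is specifically required, that would need extra post-processing, which this sketch does not attempt.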

Upvotes: 5
