Reputation: 5508
I have a component that creates XML documents from objects through a combination of XML serialization and XSL transformation; the resulting documents are handled as XDocument objects. I use the XDocument.Save(XmlWriter) method to save documents to disk using UTF-8 encoding, like this:
XDocument doc = this.CreateDocumentFrom(...);
using (Stream stream = File.OpenWrite(...))
{
    var encoding = new UTF8Encoding(false);
    var settings = new XmlWriterSettings { Encoding = encoding };
    using (var writer = XmlWriter.Create(stream, settings))
    {
        doc.Save(writer);
    }
}
Creating and writing the documents to disk works fine. Now I have a requirement that text values within the XML must use a restricted character set: only a small subset of the ASCII characters is allowed (say, upper- and lowercase letters without umlauts, digits, and some special characters such as comma, dot, ...). So I thought I could simply inherit from the UTF8Encoding class and override some methods to get the desired behaviour by filtering out invalid characters. I tried overriding GetBytes(string) and GetString(byte[]), but it didn't work. It seems that the XmlWriter does not use the given encoding instance at all.
This is what I tried...
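using System.Linq;  // for Where(...) and ToArray()
using System.Text;  // for UTF8Encoding

public sealed class CustomEncoding : UTF8Encoding
{
    private const string ValidChars = "abc...xyzABC...XYZ0...9";

    public CustomEncoding() : base(false) { }

    public override byte[] GetBytes(string s)
    {
        // Drop every character that is not in the whitelist, then encode the rest.
        char[] characters = s.Where(x => ValidChars.Contains(x)).ToArray();
        return base.GetBytes(characters);
    }
    ...
}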
In the end, I overrode almost everything to figure out which methods of the Encoding class the writer calls, but the only call I observed was to an overload of GetCharCount(...), made when XmlWriter.Create(Stream, XmlWriterSettings) is called. I get the feeling that I am on the wrong track...
Deriving a class from XmlTextWriter or XmlWriter also felt wrong to me, because then I can no longer use XmlWriter.Create(Stream, XmlWriterSettings), which is the recommended way to create XmlWriter instances.
Upvotes: 1
Views: 232
Reputation: 8190
If it were me, I'd scrub the data (presumably an instance of a class?) before calling the XmlWriter. I might even create a derived class from the class you're serializing and then serialize that.
As an example:
public class SomeFoo
{
    public string SomeTextValue { get; set; }
}

public class SomeDerivedFoo : SomeFoo
{
    private SomeDerivedFoo() { }

    public static SomeDerivedFoo CreateFromSomeFoo(SomeFoo someFoo)
    {
        var result = new SomeDerivedFoo();
        result.SomeTextValue = someFoo.SomeTextValue; // scrub your data here
        return result;
    }
}
Then, in your XmlWriter, you serialize SomeDerivedFoo as SomeFoo.
Or, for a similar effect without a new class, create a ScrubForSerialization() method that does the same thing on the original class.
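Since the documents in the question end up as XDocument objects, the same "scrub before writing" idea could also be applied to the document itself right before doc.Save(writer). The following is only a rough sketch of that: the XmlTextScrubber class, its method names, and the concrete whitelist (ASCII letters, digits, space, comma, dot) are my assumptions, not something from the question, and it only touches text nodes, not attribute values.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper; the class name, method names and whitelist are assumptions.
public static class XmlTextScrubber
{
    // Assumed whitelist: ASCII letters, digits, space, comma and dot.
    private const string ValidChars =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 ,.";

    // Returns the input with every character outside the whitelist removed.
    public static string Scrub(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return value;
        }

        return new string(value.Where(c => ValidChars.IndexOf(c) >= 0).ToArray());
    }

    // Scrubs every text node (including CDATA sections) of the document in place.
    public static void ScrubTextNodes(XDocument doc)
    {
        // ToList() takes a snapshot of the nodes before any of them is modified.
        foreach (XText text in doc.DescendantNodes().OfType<XText>().ToList())
        {
            text.Value = Scrub(text.Value);
        }
    }
}
```

You would call XmlTextScrubber.ScrubTextNodes(doc) right before doc.Save(writer) and leave the UTF8Encoding and the XmlWriter.Create(Stream, XmlWriterSettings) call exactly as they are.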
Upvotes: 2