ewitkows

Reputation: 3618

XmlSerializer adding extra characters

I have a method that serializes an object to a string, Exhibit A:

    Shared Function Serialize(ByVal o As Object) As String
        Dim rtnVal As String = ""
        Dim x As New System.Xml.Serialization.XmlSerializer(o.GetType())

        Using memStream As New MemoryStream
            Dim stWriter As New System.IO.StreamWriter(memStream)
            x.Serialize(stWriter, o)
            rtnVal = Encoding.UTF8.GetString(memStream.GetBuffer())
        End Using

        Return rtnVal
    End Function

I'm now inserting this serialized data into an XML-typed column in my SQL Server 2012 database. Most of the time this code works very well, but for one particular object I'm getting "invalid" characters, namely the error "parsing line 5 character 17 illegal xml character". I took a look at my data, and it looks clean, as you can see here:

    <?xml version="1.0" encoding="utf-8"?>
    <RatingDetails xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <LenderName>dsfg</LenderName>
        <VehiclePrice>345</VehiclePrice>
    </RatingDetails>

Some snooping led me to the XmlConvert.IsXmlChar method - http://msdn.microsoft.com/en-us/library/system.xml.xmlconvert.isxmlchar%28v=vs.100%29.aspx - and using it I was able to loop through each character in my serialized XML string. Lo and behold, I DO have invalid data: there are 15 invalid characters at the end of my string - WTF!?!
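For reference, the character-by-character check described above can be sketched roughly as follows. This is only an illustration: the helper name FindInvalidXmlChars and the sample input are hypothetical, and the loop looks at one UTF-16 code unit at a time, so surrogate pairs are not handled.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module XmlCharCheck
    ' Hypothetical helper: returns the index of every character that
    ' XmlConvert.IsXmlChar rejects. Surrogate pairs are not handled here.
    Function FindInvalidXmlChars(ByVal text As String) As List(Of Integer)
        Dim positions As New List(Of Integer)
        For i As Integer = 0 To text.Length - 1
            If Not XmlConvert.IsXmlChar(text(i)) Then
                positions.Add(i)
            End If
        Next
        Return positions
    End Function

    Sub Main()
        ' Hypothetical input: valid XML followed by two NUL characters.
        Dim sample As String = "<a>1</a>" & New String(ChrW(0), 2)
        For Each pos As Integer In FindInvalidXmlChars(sample)
            Console.WriteLine("Invalid character U+{0:X4} at index {1}", AscW(sample(pos)), pos)
        Next
    End Sub
End Module
```

Printing the code point rather than the character itself helps here, because non-printing characters may not show up when the string is simply displayed.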

So my questions to you all are: where the heck are the extra characters coming from, why can't I see them when I inspect the string in Quick Watch, and how do I prevent them in the first place?

Yours in ASP.NET, ewitkows

Upvotes: 2

Views: 1806

Answers (1)

Steven Doggart

Reputation: 43743

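The problem is that you are calling MemoryStream.GetBuffer, which returns the stream's entire internal buffer rather than just the bytes that were written to it. According to the MSDN article: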

Note that the buffer contains allocated bytes which might be unused. For example, if the string "test" is written into the MemoryStream object, the length of the buffer returned from GetBuffer is 256, not 4, with 252 bytes unused. To obtain only the data in the buffer, use the ToArray method; however, ToArray creates a copy of the data in memory.
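A minimal sketch of that behavior, assuming a default-constructed MemoryStream and the same "test" string used in the quoted note (the module name BufferDemo is made up for illustration). The unused bytes in the buffer are zeros, so decoding the whole buffer as UTF-8 appends NUL (U+0000) characters, which are not legal in XML:

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module BufferDemo
    Sub Main()
        Using memStream As New MemoryStream()
            Dim data As Byte() = Encoding.UTF8.GetBytes("test")
            memStream.Write(data, 0, data.Length)

            ' Length counts only the bytes written; GetBuffer exposes the whole allocation.
            Console.WriteLine("Stream length:      {0}", memStream.Length)
            Console.WriteLine("GetBuffer().Length: {0}", memStream.GetBuffer().Length)
            Console.WriteLine("ToArray().Length:   {0}", memStream.ToArray().Length)

            ' Decoding the whole buffer also decodes the unused zero bytes as NUL characters.
            Dim padded As String = Encoding.UTF8.GetString(memStream.GetBuffer())
            Dim trailingNuls As Integer = padded.Length - padded.TrimEnd(ChrW(0)).Length
            Console.WriteLine("Trailing NUL characters: {0}", trailingNuls)
        End Using
    End Sub
End Module
```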

To fix it, you could call ToArray instead:

    Shared Function Serialize(ByVal o As Object) As String
        Dim rtnVal As String = ""
        Dim x As New System.Xml.Serialization.XmlSerializer(o.GetType())
        Using memStream As New MemoryStream
            Dim stWriter As New System.IO.StreamWriter(memStream)
            x.Serialize(stWriter, o)
            ' Defensive: make sure any buffered text has reached memStream
            stWriter.Flush()
            ' ToArray returns only the bytes that were actually written
            rtnVal = Encoding.UTF8.GetString(memStream.ToArray())
        End Using
        Return rtnVal
    End Function

However, that's still not really efficient: ToArray copies the entire contents of the stream into a new array, which is wasteful if the stream contains a lot of data. For peace of mind, I would recommend using a StreamReader to read the MemoryStream rather than decoding the bytes yourself (but don't forget to seek back to the beginning of the stream before reading it):

    Public Function Serialize(ByVal o As Object) As String
        Dim rtnVal As String = ""
        Dim x As New System.Xml.Serialization.XmlSerializer(o.GetType())
        Using memStream As New MemoryStream
            Dim stWriter As New System.IO.StreamWriter(memStream)
            x.Serialize(stWriter, o)
            ' Defensive: make sure any buffered text has reached memStream
            stWriter.Flush()
            Dim reader As New StreamReader(memStream)
            memStream.Position = 0  ' Seek to start of stream
            rtnVal = reader.ReadToEnd()
        End Using
        Return rtnVal
    End Function
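Another option, if the copy made by ToArray is the only concern, is to keep GetBuffer but decode just the part of the buffer that was actually written. The following is a sketch under that assumption, not a drop-in replacement; the module name SerializationSketch is made up, and it relies on the Encoding.GetString(bytes, index, count) overload together with MemoryStream.Length:

```vbnet
Imports System.IO
Imports System.Text
Imports System.Xml.Serialization

Module SerializationSketch
    Function Serialize(ByVal o As Object) As String
        Dim x As New XmlSerializer(o.GetType())
        Using memStream As New MemoryStream()
            Dim stWriter As New StreamWriter(memStream)
            x.Serialize(stWriter, o)
            ' Defensive: make sure any buffered text has reached memStream
            stWriter.Flush()
            ' Decode only the bytes that were written, ignoring the unused tail of the buffer
            Return Encoding.UTF8.GetString(memStream.GetBuffer(), 0, CInt(memStream.Length))
        End Using
    End Function
End Module
```

This avoids both the trailing characters and the extra array copy, at the cost of still decoding the bytes by hand.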

Upvotes: 5
