Reputation: 199
When I URL-encode a string (namely an XML file), on some occasions it adds %00 characters at the end of the file. I'd like to know why this happens and whether it can be prevented (I can always strip the %00 characters). The XML file was created using XmlWriter. The weird thing is that I use the same code to create other XML files, and after encoding them no %00 characters are added.
Example:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE peticion >
<peticion>
<nombre>Info hotel</nombre>
<agencia>HOTUSA</agencia>
<tipo>15</tipo>
</peticion>
Edit: to create the xml this is what I do.
Dim xmlWriterSettings As New System.Xml.XmlWriterSettings
With xmlWriterSettings
.Encoding = Encoding.GetEncoding("iso-8859-1")
.OmitXmlDeclaration = False
.Indent = True
End With
Dim ms As New IO.MemoryStream
Using writer As System.Xml.XmlWriter = System.Xml.XmlWriter.Create(ms, xmlWriterSettings)
With writer
.WriteDocType("peticion", Nothing, Nothing, Nothing)
.WriteStartElement("peticion")
.WriteElementString("nombre", "Info hotel")
.WriteElementString("agencia", "HOTUSA")
.WriteElementString("tipo", "15")
.WriteEndElement()
End With
End Using
Dim xml As String = Encoding.GetEncoding("iso-8859-1").GetString(ms.GetBuffer)
Dim XmlEncoded As String = HttpUtility.UrlEncode(xml)
XmlEncoded contains:
%3c%3fxml+version%3d%221.0%22+encoding%3d%22iso-8859-1%22%3f%3e%0d%0a%3c!DOCTYPE+peticion+%3e%0d%
0a%3cpeticion%3e%0d%0a++%3cnombre%3eInfo+hotel%3c%2fnombre%3e%0d%0a++%3cagencia%3eHOTUSA%3c%
2fagencia%3e%0d%0a++%3ctipo%3e15%3c%2ftipo%3e%0d%0a%3c%2fpeticion%3e%00%00%00%00%00%00%00%00%00%
00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%
00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%
00%00%00%00%00%00%00%00%00%00%00%00%00%00
Where do all these %00 come from?
Upvotes: 3
Views: 4194
Reputation: 64068
The remarks on MemoryStream.GetBuffer provide the appropriate guidance:
Note that the buffer contains allocated bytes which might be unused. For example, if the string "test" is written into the MemoryStream object, the length of the buffer returned from GetBuffer is 256, not 4, with 252 bytes unused. To obtain only the data in the buffer, use the ToArray method; however, ToArray creates a copy of the data in memory.
Modify your code like so:
Dim xml As String = Encoding.GetEncoding("iso-8859-1").GetString(ms.ToArray)
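To see the difference concretely, here is a minimal sketch that writes a few bytes into a MemoryStream and decodes them in two ways. The module name `BufferDemo` and the helpers `DecodeWithToArray` and `DecodeBufferSlice` are made up for illustration. The second helper is an alternative not mentioned above: it decodes only the first `ms.Length` bytes of the buffer through the `GetString(bytes, index, count)` overload, which avoids the copy that ToArray makes.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module BufferDemo
    ' Decodes only the bytes actually written, via a copy of the data.
    Function DecodeWithToArray(ms As MemoryStream, enc As Encoding) As String
        Return enc.GetString(ms.ToArray())
    End Function

    ' Decodes only the bytes actually written, without copying the buffer.
    Function DecodeBufferSlice(ms As MemoryStream, enc As Encoding) As String
        Return enc.GetString(ms.GetBuffer(), 0, CInt(ms.Length))
    End Function

    Sub Main()
        Dim enc As Encoding = Encoding.GetEncoding("iso-8859-1")
        Using ms As New MemoryStream
            Dim data As Byte() = enc.GetBytes("test")
            ms.Write(data, 0, data.Length)

            ' The internal buffer is usually larger than the data written to it.
            Console.WriteLine("Buffer length: " & ms.GetBuffer().Length)
            Console.WriteLine("Stream length: " & ms.Length)

            ' Decoding the whole buffer turns the unused bytes into NUL characters.
            Dim padded As String = enc.GetString(ms.GetBuffer())
            Console.WriteLine("Padded length: " & padded.Length)

            Console.WriteLine(DecodeWithToArray(ms, enc))
            Console.WriteLine(DecodeBufferSlice(ms, enc))
        End Using
    End Sub
End Module
```

Both helpers return the same string; the slice version mainly matters when the stream is large and the extra copy is worth avoiding.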
In fact, a better option in this case would be to use a StringBuilder:
Dim sb As New StringBuilder
Using writer As XmlWriter = XmlWriter.Create(sb, xmlWriterSettings)
' ...
End Using
Dim xml As String = sb.ToString()
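A fuller sketch of the StringBuilder route, under the assumption that a shortened version of the document from the question is enough to show the idea (the module name `StringBuilderDemo` and the function `BuildXml` are hypothetical). One caveat worth checking before adopting it: when the writer targets a StringBuilder, the `Encoding` setting is not applied, because the output is already a .NET string, and the XML declaration reports utf-16 rather than iso-8859-1. If the receiving service depends on the declared encoding, the MemoryStream plus ToArray version may be the safer choice.

```vbnet
Imports System
Imports System.Text
Imports System.Xml

Module StringBuilderDemo
    ' Builds a shortened version of the request document as a string.
    Function BuildXml() As String
        Dim settings As New XmlWriterSettings
        With settings
            ' Not applied when writing to a StringBuilder: the declaration says utf-16.
            .Encoding = Encoding.GetEncoding("iso-8859-1")
            .OmitXmlDeclaration = False
            .Indent = True
        End With

        Dim sb As New StringBuilder
        Using writer As XmlWriter = XmlWriter.Create(sb, settings)
            writer.WriteStartElement("peticion")
            writer.WriteElementString("nombre", "Info hotel")
            writer.WriteElementString("tipo", "15")
            writer.WriteEndElement()
        End Using ' disposing the writer flushes everything into sb

        Return sb.ToString()
    End Function

    Sub Main()
        Dim xml As String = BuildXml()
        Console.WriteLine(xml)
        ' No byte buffer is involved, so there are no trailing NUL characters.
        Console.WriteLine("Contains NUL: " & (xml.IndexOf(ChrW(0)) >= 0))
    End Sub
End Module
```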
Upvotes: 4
Reputation: 4112
I believe that ms.GetBuffer contains more than you think. %00 represents a NULL, and my guess is that the buffer contains filler NULLs at the end.
Instead do:
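Using ms As New IO.MemoryStream
Using writer As System.Xml.XmlWriter = System.Xml.XmlWriter.Create(ms, xmlWriterSettings)
With writer
.WriteDocType("peticion", Nothing, Nothing, Nothing)
.WriteStartElement("peticion")
.WriteElementString("nombre", "Info hotel")
.WriteElementString("agencia", "HOTUSA")
.WriteElementString("tipo", "15")
.WriteEndElement()
End With
End Using ' disposing the writer flushes its pending output into ms
' Rewind, then read only the bytes that were actually written.
ms.Position = 0
' MemoryStream has no ReadToEnd of its own, so wrap it in a StreamReader
' that uses the same encoding the writer used.
Using reader As New IO.StreamReader(ms, Encoding.GetEncoding("iso-8859-1"))
Dim xml As String = reader.ReadToEnd()
Dim XmlEncoded As String = HttpUtility.UrlEncode(xml)
End Using
End Using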
See this question for more info on getting a string from a MemoryStream.
See this documentation detailing the fact that the buffer contains allocated bytes which might be unused.
Upvotes: 1