Reputation: 354
Is it possible to save the ASCII NUL character in XML, like this: <data>*NUL**NUL**NUL*</data>?
I know I can display this value in Java using System.out.println("\0"), and I wonder if XML can handle this value.
My objective is to get "\0\0\0" from XML using Java.
Thank you in advance!
Upvotes: 5
Views: 7278
Reputation: 7853
NUL (U+0000) is not allowed in XML 1.0 or in XML 1.1.
Wikipedia: Valid characters in XML
Note that the code point U+0000, assigned to the null control character, is the only character encoded in Unicode and ISO/IEC 10646 that is always invalid in any XML 1.0 and 1.1 document.
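To see this from the Java side, here is a minimal sketch that feeds a few small documents to the JDK's built-in DOM parser and reports whether each one is accepted. The class and method names (NulInXmlCheck, isWellFormed) are made up for illustration; only the javax.xml / org.xml.sax calls are real API.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any library.
class NulInXmlCheck {

    // Returns true if the JDK parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser reports illegal characters as a SAXParseException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("unexpected parser failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<data>abc</data>"));     // true
        System.out.println(isWellFormed("<data>\0\0\0</data>"));  // false: literal NUL
        System.out.println(isWellFormed("<data>&#0;</data>"));    // false: character reference to NUL
    }
}
```

Note that the character reference form is rejected as well, so escaping NUL as &#0; is not a way around the restriction. The default parser may also log each fatal error to stderr.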
Upvotes: 3
Reputation: 109532
Per the XML 1.0 specification it is not allowed (and XML 1.1 excludes it as well).
The ASCII NUL, aka '\0' aka \u0000, is a normal character in Java. In C/C++, however, it is used as a string terminator, so C software processing the XML would probably detect the end of the XML text far too early.
Java has a workaround of sorts for this: in UTF-8, Unicode values > 127 are encoded as multi-byte sequences in which every byte has its 8th bit set, and DataOutputStream.writeUTF (which uses Java's "modified UTF-8") writes '\0' as a multi-byte sequence too. No zero byte appears in the output, and DataInputStream.readUTF decodes it back normally. That encoding is not standard UTF-8, though, and the decoded character is still illegal for an XML parser.
So it is not a good idea.
Also mind that binary data should be converted to Base64 ASCII instead, as UTF-8 is not suited for binary data.
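As a small illustration of the encoding difference, the following sketch prints the bytes produced for a single NUL by standard UTF-8 and by DataOutputStream.writeUTF. The class name ModifiedUtf8Demo and its helper methods are invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class ModifiedUtf8Demo {

    // Bytes produced by DataOutputStream.writeUTF: a 2-byte length prefix
    // followed by the string in Java's modified UTF-8.
    static byte[] writeUtfBytes(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Formats bytes as space-separated lowercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Standard UTF-8 encodes U+0000 as a single zero byte.
        System.out.println(hex("\0".getBytes(StandardCharsets.UTF_8))); // 00
        // writeUTF emits the length (00 02) and then c0 80 for U+0000.
        System.out.println(hex(writeUtfBytes("\0")));                   // 00 02 c0 80
    }
}
```

The c0 80 pair is an overlong form that a strict UTF-8 decoder rejects, which is one more reason not to put writeUTF output into an XML file.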
Upvotes: 3
Reputation: 3296
I have not read the XML standard, but since Python's ElementTree complains that it is not a valid XML character, I think it is not supported by XML. You could implement an escape mechanism and represent "\0" with "\\0". Another possibility is to use the common Base64 encoding.
In Java, it may look like this:
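import java.nio.charset.StandardCharsets;
import java.util.Base64;
// write data to element
String data = ...
element.setText(Base64.getEncoder().encodeToString(data.getBytes(StandardCharsets.UTF_8)));
// read data from element (the original snippet had a stray closing parenthesis here)
String decoded = new String(Base64.getDecoder().decode(element.getText()), StandardCharsets.UTF_8);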
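For a complete round trip, here is a sketch using only the JDK's DOM and transformer classes instead of the element.setText / getText API above. The class Base64XmlRoundTrip and its toXml / fromXml methods are hypothetical names chosen for this example, and the single <data> root element is an assumption taken from the question.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class Base64XmlRoundTrip {

    // Builds <data>...</data> whose text is the Base64 form of the value.
    static String toXml(String value) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element data = doc.createElement("data");
            data.setTextContent(Base64.getEncoder()
                    .encodeToString(value.getBytes(StandardCharsets.UTF_8)));
            doc.appendChild(data);

            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize XML", e);
        }
    }

    // Parses the document and decodes the root element's text back to a String.
    static String fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            String encoded = doc.getDocumentElement().getTextContent();
            return new String(Base64.getDecoder().decode(encoded),
                    StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException("could not read XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = toXml("\0\0\0");
        System.out.println(xml); // the element text is AAAA, the Base64 of three zero bytes
        System.out.println(fromXml(xml).equals("\0\0\0")); // true
    }
}
```

Three zero bytes encode to AAAA, so the stored document contains only ordinary ASCII and any conforming parser can read it.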
Upvotes: 2
Reputation: 11773
These are the possibilities for what the data might look like:
<row>
<data>actual data</data>
</row>
<row>
<!--null using attr. n ="t"-->
<data n="t"></data>
</row>
<row>
<!--some other meaning-->
<data/>
</row>
edit: If you want to represent multiple nulls, take the attribute route and let the attribute value say how many nulls there are.
<row>
<!--null using attr. n ="3"-->
<data n="3"></data>
</row>
which is three nulls in the example.
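On the reading side, the count attribute could be turned back into a Java string roughly like this. It is a sketch that handles only the numeric form of the n attribute; the class NulCountAttribute and the method readData are names invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class NulCountAttribute {

    // Returns the value of the first <data> element: its text if there is no
    // "n" attribute, otherwise a string of n NUL characters.
    static String readData(String rowXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rowXml)));
            Element data = (Element) doc.getElementsByTagName("data").item(0);

            String count = data.getAttribute("n"); // "" when the attribute is absent
            if (count.isEmpty()) {
                return data.getTextContent();
            }
            // A freshly allocated char[] is filled with '\0', so this yields
            // exactly that many NUL characters.
            return new String(new char[Integer.parseInt(count)]);
        } catch (Exception e) {
            throw new IllegalStateException("could not read <data> element", e);
        }
    }

    public static void main(String[] args) {
        String value = readData("<row><data n=\"3\"></data></row>");
        System.out.println(value.equals("\0\0\0")); // true
        System.out.println(readData("<row><data>actual data</data></row>")); // actual data
    }
}
```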
edit: This is also valid XML, because the content is the two literal characters \ and 0 rather than an actual NUL:
<row>
<data>\0</data>
</row>
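XML gives the backslash no special meaning, so the parser hands back the two characters \ and 0 and your application has to translate them into a NUL itself. As XML, though, there is nothing wrong with it.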
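If you go with a backslash convention like this, the application has to do the translation in both directions and must also escape the backslash itself, otherwise a genuine \ followed by 0 in the data becomes ambiguous. A minimal sketch, where the class NulEscaper and the exact escape rules (\0 for NUL, \\ for a backslash) are assumptions made for this example:

```java
// Hypothetical helper class, not part of any library.
class NulEscaper {

    // Replaces each backslash with two backslashes and each NUL with
    // backslash + '0', so the result is safe to store as XML text
    // (apart from the usual XML escaping of < and &).
    static String escape(String raw) {
        StringBuilder sb = new StringBuilder();
        for (char c : raw.toCharArray()) {
            if (c == '\\') {
                sb.append("\\\\");
            } else if (c == '\0') {
                sb.append("\\0");
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Reverses escape(): backslash + '0' becomes NUL, and a backslash
    // followed by any other character yields that character.
    static String unescape(String escaped) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '\\' && i + 1 < escaped.length()) {
                char next = escaped.charAt(++i);
                sb.append(next == '0' ? '\0' : next);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String raw = "\0\0\0";
        String stored = escape(raw);   // six characters: \0\0\0
        System.out.println(stored);
        System.out.println(unescape(stored).equals(raw)); // true
    }
}
```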
Upvotes: 2