Michal Kordas

Reputation: 10925

Escape control characters in XML 1.0

I understand why control characters are illegal in XML 1.0, but I still need to store them in an XML payload, and I cannot find any recommendations on how to escape them. Upgrading to XML 1.1 is not an option.

How should I escape, for example, the SOH character (\u0001, the standard field separator in FIX messages)?

The following doesn't work:

<data>&#x01;</data>

Upvotes: 0

Views: 1232

Answers (3)

Jeff Brophy

Reputation: 1

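My company ended up applying our own markup to the text before putting it into XML: a control character is written as its decimal code in braces, e.g. {1} for SOH. You also have to escape the { and } braces themselves as {123} and {125}. Then, when reading the XML, you have to parse the embedded codes yourself.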
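
A minimal sketch of that kind of scheme in Python, assuming the code inside the braces is the decimal code point (which matches {123} and {125} for the braces). The function names escape_controls and unescape_controls are made up for illustration and are not our actual implementation:

```python
import re

# C0 control characters that XML 1.0 forbids (everything below 0x20 except
# tab, LF and CR), plus the two braces used as escape delimiters.
_NEEDS_ESCAPE = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f{}]")
_ESCAPED = re.compile(r"\{(\d+)\}")


def escape_controls(text: str) -> str:
    """Replace each forbidden character or brace with {decimal code point}."""
    return _NEEDS_ESCAPE.sub(lambda m: "{%d}" % ord(m.group(0)), text)


def unescape_controls(text: str) -> str:
    """Reverse escape_controls: turn every {N} back into chr(N)."""
    return _ESCAPED.sub(lambda m: chr(int(m.group(1))), text)


fix_message = "8=FIX.4.2\x019=12\x01"
escaped = escape_controls(fix_message)  # "8=FIX.4.2{1}9=12{1}"
restored = unescape_controls(escaped)
```

Because literal braces are always escaped, every {N} left in the stored text is guaranteed to be an escape code, so decoding is unambiguous.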

Upvotes: 0

Andy Lynch

Reputation: 1228

It's quite common in logging/printing of FIX messages to substitute SOH with another character like '|'. Could you do the same here?
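
As a rough sketch in Python (the helper names are hypothetical): the substitution is only reversible if '|' never occurs inside a field value, so this version refuses messages that already contain it.

```python
SOH = "\x01"


def soh_to_pipe(message: str) -> str:
    """Replace SOH with '|' so the message can be stored as XML text."""
    if "|" in message:
        # A literal '|' would be indistinguishable from a separator later.
        raise ValueError("message already contains '|'")
    return message.replace(SOH, "|")


def pipe_to_soh(text: str) -> str:
    """Restore the original SOH separators."""
    return text.replace("|", SOH)


stored = soh_to_pipe("8=FIX.4.2\x019=12\x01")  # "8=FIX.4.2|9=12|"
```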

Upvotes: 0

Michael Kay

Reputation: 163488

One way is to use processing instructions: <?hex 01?>. But that only works in element content, not in attributes. And of course the processing instruction needs to be understood by the receiving application.

You could also use elements: <hex value="01"/>, but elements are visible in an XSD schema or DTD, while processing instructions are hidden.
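
To illustrate the processing-instruction variant, here is a minimal Python sketch using the standard library's xml.dom.minidom, which keeps processing instructions as nodes. The function names and the <data> wrapper element are assumptions for the example, and the reader only handles plain text nodes and hex instructions directly inside the root element:

```python
from xml.dom import Node, minidom
from xml.sax.saxutils import escape


def is_forbidden(ch: str) -> bool:
    """True for C0 control characters that XML 1.0 does not allow."""
    return ord(ch) < 0x20 and ch not in "\t\n\r"


def write_payload(raw: str) -> str:
    """Serialize raw text, emitting <?hex NN?> for forbidden characters."""
    parts = []
    for ch in raw:
        parts.append("<?hex %02X?>" % ord(ch) if is_forbidden(ch) else escape(ch))
    return "<data>" + "".join(parts) + "</data>"


def read_payload(xml_text: str) -> str:
    """Rebuild the raw text from text nodes and hex processing instructions."""
    doc = minidom.parseString(xml_text)
    parts = []
    for node in doc.documentElement.childNodes:
        if node.nodeType == Node.TEXT_NODE:
            parts.append(node.data)
        elif node.nodeType == Node.PROCESSING_INSTRUCTION_NODE and node.target == "hex":
            parts.append(chr(int(node.data.strip(), 16)))
    return "".join(parts)


xml_text = write_payload("8=FIX.4.2\x019=12\x01")
# xml_text is "<data>8=FIX.4.2<?hex 01?>9=12<?hex 01?></data>"
```

The same loop would work for the <hex value="01"/> element form by checking for element nodes and reading the value attribute instead.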

Another approach: if a piece of payload can contain such characters, encode the whole payload in Base64.
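
A short Python sketch of the Base64 route, using only the standard library. The element name data and the encoding="base64" attribute are conventions invented for this example; sender and receiver would need to agree on something equivalent:

```python
import base64
import xml.etree.ElementTree as ET


def wrap_payload(raw: bytes) -> str:
    """Store arbitrary bytes as Base64 text inside a <data> element."""
    element = ET.Element("data", encoding="base64")  # hypothetical marker attribute
    element.text = base64.b64encode(raw).decode("ascii")
    return ET.tostring(element, encoding="unicode")


def unwrap_payload(xml_text: str) -> bytes:
    """Parse the element and decode its Base64 content back to bytes."""
    return base64.b64decode(ET.fromstring(xml_text).text)


fix_message = b"8=FIX.4.2\x019=12\x01"
xml_text = wrap_payload(fix_message)
round_tripped = unwrap_payload(xml_text)
```

The cost is that the payload is no longer human-readable in the XML and grows by roughly a third.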

Upvotes: 3
