Reputation: 430
I'd like to use XMLStreamReader to read an XML file that contains horizontal tab character references (&#x9;), for example:
<tag>foo&#x9;bar</tag>
and print it out or write it back to another XML file.
Google tells me to set javax.xml.stream.isCoalescing to true in XMLInputFactory, but my test code below does not work as expected.
public static void main(String[] args) throws IOException, XMLStreamException {
    XMLInputFactory factory = XMLInputFactory.newInstance();
    factory.setProperty(XMLInputFactory.IS_COALESCING, true);
    System.out.println("IS_COALESCING supported ? " + factory.isPropertySupported(XMLInputFactory.IS_COALESCING));
    System.out.println("factory IS_COALESCING value is " + factory.getProperty(XMLInputFactory.IS_COALESCING));
    // The input contains the tab as a numeric character reference.
    String rawString = "<tag>foo&#x9;bar</tag>";
    XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(rawString));
    System.out.println("reader IS_COALESCING value is " + reader.getProperty(XMLInputFactory.IS_COALESCING));
    PrintWriter pw = new PrintWriter(System.out, true);
    while (reader.hasNext())
    {
        reader.next();
        pw.print(reader.getEventType());
        if (reader.hasText())
            pw.append(' ').append(reader.getText());
        pw.println();
    }
}
The output is:
IS_COALESCING supported ? true
factory IS_COALESCING value is true
reader IS_COALESCING value is true
1
4 foo bar
2
8
But I want to keep the horizontal tab as a character reference, like:
IS_COALESCING supported ? true
factory IS_COALESCING value is true
reader IS_COALESCING value is true
1
4 foo&#x9;bar
2
8
What am I missing here? Thanks.
Upvotes: 1
Views: 361
Reputation: 662
From what I see, the parsing part is correct; it's just not printed the way you envision it. The character reference in your input is resolved by the XML reader to the tab character (\t) and represented accordingly in the Java string.
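To make this visible, here is a minimal sketch that uses only the JDK's StAX API and checks which character actually ends up in the text returned by the reader. The class name TabReferenceCheck and the helper readText are made up for this illustration, and the input assumes the tab was written as the reference &#x9; as in the question.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class TabReferenceCheck {

    // Hypothetical helper: returns all character data of the document as one string.
    static String readText(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.IS_COALESCING, true);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            StringBuilder text = new StringBuilder();
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
            reader.close();
            return text.toString();
        } catch (XMLStreamException e) {
            // Wrapped so that callers of this sketch need no checked exception handling.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String text = readText("<tag>foo&#x9;bar</tag>");
        // The reference has already been resolved: seven characters, the fourth one is a tab.
        System.out.println("length = " + text.length());
        System.out.println("char at index 3 is a tab: " + (text.charAt(3) == '\t'));
    }
}
```

So the tab is not lost. It is only the original spelling as a character reference that is gone after parsing, and IS_COALESCING has no influence on that.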
Using Guava's XmlEscapers, I can produce something similar to what you want to have:
import com.google.common.xml.XmlEscapers;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class Test {
    public static void main(String[] args) throws IOException, XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, true);
        System.out.println("IS_COALESCING supported ? " + factory.isPropertySupported(XMLInputFactory.IS_COALESCING));
        System.out.println("factory IS_COALESCING value is " + factory.getProperty(XMLInputFactory.IS_COALESCING));
        String rawString = "<tag>foo&#x9;bar</tag>";
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(rawString));
        System.out.println("reader IS_COALESCING value is " + reader.getProperty(XMLInputFactory.IS_COALESCING));
        PrintWriter pw = new PrintWriter(System.out, true);
        while (reader.hasNext()) {
            reader.next();
            pw.print(reader.getEventType());
            if (reader.hasText()) {
                // Re-escape the parsed text; the attribute escaper turns \t back into &#x9;
                pw.append(' ').append(XmlEscapers.xmlAttributeEscaper().escape(reader.getText()));
            }
            pw.println();
        }
    }
}
The output looks like this:
IS_COALESCING supported ? true
factory IS_COALESCING value is true
reader IS_COALESCING value is true
1
4 foo&#x9;bar
2
8
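If pulling in Guava just for this is not an option, the same effect can be approximated with a few lines of plain Java. This is only a sketch: the class TabEscapeSketch and the method escapeTextKeepingTabReference are hypothetical names, and it assumes that only tabs (plus the characters that must be escaped in text content anyway) need special treatment.

```java
public class TabEscapeSketch {

    // Hypothetical helper: escapes XML text content and additionally writes
    // each tab character as the numeric character reference &#x9;.
    static String escapeTextKeepingTabReference(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '\t':
                    out.append("&#x9;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The argument contains a real tab character, as returned by reader.getText().
        System.out.println(escapeTextKeepingTabReference("foo\tbar"));
    }
}
```

Note that the escaped string has to be written to the target file as-is. Passing it through XMLStreamWriter.writeCharacters would escape the ampersand a second time.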
Some remarks on this:
\t does not need to be escaped in XML content, thus I had to choose the attribute escaper. While it works, there might be some side effects: the attribute escaper also escapes quotes and line breaks, for example.
CDATA?
Upvotes: 1