Simon Jensen

Reputation: 486

StAX not returning all characters in a string

I've been trying to create an XML controller class using StAX. My problem is that I don't get the full text of an element in one piece; instead I get it in several small parts. (Note: some of the content has been hidden for security reasons and is shown as {content}.)

Characters characters = event.asCharacters();
if (!characters.isWhiteSpace()) {
    System.out.println(characters.getData());
}

The above code does not return the full string.

What I expect to receive is:
{responseType} \([0-9]+\) ACC: [0-9]+,[0-9]+,[0-9]+,[0-9]+,[0-9]+,[0-9]+,[0-9]+,[0-9]+

What I get is the above string split into 5 separate parts:
{responseType} \([0-9]+\) ACC: [0-9]
+,[0-9]+,[0-9]
+,[0-9]+,[0-9]
+,[0-9]+,[0-9]
+,[0-9]+

My code:

public static ArrayList<SmsCommand> readXML() {
    if (init()) {
        try {
            while (eventReader.hasNext()) {
                XMLEvent event = eventReader.nextEvent();
                switch (event.getEventType()) {
                case XMLStreamConstants.START_ELEMENT:
                    StartElement startElement = event.asStartElement();
                    String qName = startElement.getName().getLocalPart();
                    if (qName.equalsIgnoreCase("command")) {
                        Iterator<Attribute> attributes = startElement.getAttributes();
                        command = new SmsCommand(attributes.next().getValue());
                    }
                    break;
                case XMLStreamConstants.CHARACTERS:
                    Characters characters = event.asCharacters();
                    if (!characters.isWhiteSpace()) {
                        command.addResponse(characters.getData());
                    }
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    EndElement endElement = event.asEndElement();
                    if (endElement.getName().getLocalPart().equalsIgnoreCase("command")) {
                        commands.add(command);
                    }
                    break;
                }
            }
        }
        catch (XMLStreamException e) {
            e.printStackTrace();
        }
    }
    return commands;
}

As well as my xml:

<?xml version="1.0" ?>
<root>
  <command type="{command}">
    <response>{responseType} \([0-9]+\) ACC: [0-9]+,[0-9]+,[0-9]+,[0-9]+,[0-9]+,[0-9]+,[0-9]+,[0-9]+</response>
  </command>
</root>

Upvotes: 3

Views: 1834

Answers (2)

Teddy

Reputation: 4243

The StAX parser is simply delivering the text in smaller pieces, in the original order. You can reconstruct the full string with a StringBuilder, ideally with a length check for safety. Alternatively, you can set a factory property so that the parser combines the pieces for you.
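The StringBuilder approach could look roughly like the sketch below. It assumes the text of interest lives in <response> elements, as in the question's XML; the class name BufferedTextDemo and the method readResponses are made up for illustration, and the length check mentioned above is left out to keep it short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class, not part of the question's code.
class BufferedTextDemo {

    // Returns the complete text of every <response> element, no matter how
    // many CHARACTERS events the parser splits that text into.
    static List<String> readResponses(String xml) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        List<String> responses = new ArrayList<>();
        StringBuilder text = null; // non-null only while inside <response>

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && "response".equals(event.asStartElement().getName().getLocalPart())) {
                text = new StringBuilder();
            } else if (event.isCharacters() && text != null) {
                // Append each fragment in the order it arrives.
                text.append(event.asCharacters().getData());
            } else if (event.isEndElement()
                    && "response".equals(event.asEndElement().getName().getLocalPart())) {
                // Only at the end tag is the text known to be complete.
                responses.add(text.toString());
                text = null;
            }
        }
        reader.close();
        return responses;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><command type=\"x\">"
                + "<response>ACC: [0-9]+,[0-9]+</response></command></root>";
        System.out.println(readResponses(xml));
    }
}
```

In the question's readXML(), the equivalent change would be to call command.addResponse(...) in the END_ELEMENT branch for "response" rather than in the CHARACTERS branch.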

By default, a StAX parser may split a (typically large) CHARACTERS event into several pieces to avoid creating large strings. You have no control over where the breaks occur.

You can set the factory property “javax.xml.stream.isCoalescing” (the constant XMLInputFactory.IS_COALESCING) to change this behaviour and force the parser to combine adjacent CHARACTERS events into a single event.
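A minimal sketch of setting that property, assuming the reader is created from an XMLInputFactory as usual (the question's init() method is not shown, so the class CoalescingDemo and the method readText are hypothetical):

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class, not part of the question's code.
class CoalescingDemo {

    // Returns the data of every non-whitespace CHARACTERS event.
    static List<String> readText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge adjacent character data into one event.
        // The property must be set before the reader is created.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
        List<String> chunks = new ArrayList<>();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isCharacters() && !event.asCharacters().isWhiteSpace()) {
                chunks.add(event.asCharacters().getData());
            }
        }
        reader.close();
        return chunks;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><response>ACC: [0-9]+,[0-9]+</response></root>";
        System.out.println(readText(xml));
    }
}
```

With coalescing enabled, the CHARACTERS branch in the question's loop can stay as it is; the trade-off is that the parser may build one large string for very long text nodes.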


Upvotes: 5

Martin Honnen

Reputation: 167641

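When you know the current event is the XMLStreamConstants.START_ELEMENT of a text-only element, you can call XMLEventReader.getElementText() (http://docs.oracle.com/javase/7/docs/api/javax/xml/stream/XMLEventReader.html#getElementText()) to read the element's complete text in one call.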
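As a rough illustration, assuming <response> contains only text as in the question's XML (the class ElementTextDemo and the method readResponses are hypothetical names):

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class, not part of the question's code.
class ElementTextDemo {

    // Returns the full text of every <response> element.
    static List<String> readResponses(String xml) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        List<String> responses = new ArrayList<>();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && "response".equals(event.asStartElement().getName().getLocalPart())) {
                // Reads everything up to the matching end tag and returns it
                // as one string. It throws XMLStreamException if the element
                // contains child elements.
                responses.add(reader.getElementText());
            }
        }
        reader.close();
        return responses;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><command type=\"x\">"
                + "<response>ACC: [0-9]+,[0-9]+</response></command></root>";
        System.out.println(readResponses(xml));
    }
}
```

Note that getElementText() consumes the element's end tag, so the loop will not see a separate END_ELEMENT event for <response>.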

Upvotes: 1
