XMLStreamReader doesn't read complete tag

Question

I'm parsing XML using XMLStreamReader. One of the tags contains data loaded from a database (a WebRowSet object). The problem is that the content of this tag is very long (several hundred kilobytes, since the data is Base64-encoded), but input.getText() returns only 16,394 characters of it.

I'm 100% sure the data passed to XMLStreamReader is complete.

I found another answer here, but it doesn't solve my problem. I could of course read the data some other way, but I would like to know what is wrong with this approach.

Does somebody know how to get the whole content?

My code:

    input = xmlFactory.createXMLStreamReader(new ByteArrayInputStream(xmlData.getBytes("UTF-8")));

    while (input.hasNext()) {
        if (input.getEventType() == XMLStreamConstants.START_ELEMENT) {
            element = input.getName().getLocalPart();

            switch (element.toLowerCase()) {
                case "transactionresponse":
                    int transactionStatus = 0;

                    transactionResponse = new TransactionResponse();
                    for(int i=0; i

RealSkeptic · Accepted Answer

Event-driven XML parsers such as XMLStreamReader are designed to let you parse XML without reading it into memory all in one go, which is essential when the XML document is very large.

The design is such that it reads a certain buffer of data, and gives you events as it runs into "interesting" stuff, such as the beginning of a tag, the end of a tag, and so on.

But the buffer it reads is not infinite, as it is meant to handle large XML files, exactly like the one you have. For this reason, a large text in a tag may be represented by several consecutive CHARACTERS events.

That is, when you get a CHARACTERS event, there is no guarantee that it contains the whole text. If the text is too long for the reader's buffer, you will simply get more CHARACTERS events that follow.

Since you are only reading the data from the first CHARACTERS event, it is not the whole data.

The proper way to work with such a file is:

1. When you get a START_ELEMENT event for the element you are interested in, you make preparations for storing the text. For example, create a StringBuilder, or open a file for writing, etc.
2. For each CHARACTERS event that follows, you append the text to your storage (the StringBuilder, the file).
3. Once you get the END_ELEMENT event for the same element, you finish accumulating your data, and do whatever you need to do with it.
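
The steps above can be sketched roughly as follows. This is only an illustration, not the asker's actual code: the class name ElementTextAccumulator, the helper readFullText and its parameters are made up for the example, and it assumes the target element contains only text (no child elements).

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ElementTextAccumulator {

    // Hypothetical helper: returns the complete text of the first element
    // named targetName, or null if no such element exists.
    // Assumes the element holds text only (no nested elements).
    static String readFullText(String xml, String targetName) throws XMLStreamException {
        XMLStreamReader input = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        StringBuilder text = null; // non-null only while inside the target element
        try {
            while (input.hasNext()) {
                switch (input.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        // Step 1: prepare storage when the element starts.
                        if (text == null && targetName.equals(input.getLocalName())) {
                            text = new StringBuilder();
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        // Step 2: a long text may arrive as several events; append each one.
                        if (text != null) {
                            text.append(input.getText());
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        // Step 3: only now is the text known to be complete.
                        if (text != null && targetName.equals(input.getLocalName())) {
                            return text.toString();
                        }
                        break;
                    default:
                        break;
                }
            }
        } finally {
            input.close();
        }
        return null;
    }
}
```

Calling ElementTextAccumulator.readFullText(xmlData, "data") would then return the whole content of a (hypothetical) data element, however many CHARACTERS events the parser splits it into.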

In fact, this is what the getElementText() method does for you: called while the reader is positioned on the START_ELEMENT of a text-only element, it accumulates the data in a StringBuffer while going through CHARACTERS events until it hits the END_ELEMENT.
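
A minimal sketch of using getElementText() might look like this. The class ElementTextShortcut and the helper readWithGetElementText are hypothetical names for the example; it assumes the element of interest contains only text, since getElementText() throws an XMLStreamException if it runs into a child element.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ElementTextShortcut {

    // Hypothetical helper: returns the complete text of the first element
    // named targetName, or null if no such element exists.
    static String readWithGetElementText(String xml, String targetName) throws XMLStreamException {
        XMLStreamReader input = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (input.hasNext()) {
                if (input.next() == XMLStreamConstants.START_ELEMENT
                        && targetName.equals(input.getLocalName())) {
                    // Must be called while positioned on START_ELEMENT.
                    // It consumes every text event of the element and leaves
                    // the reader on the matching END_ELEMENT.
                    return input.getElementText();
                }
            }
        } finally {
            input.close();
        }
        return null;
    }
}
```

Because the reader ends up on the END_ELEMENT after the call, the surrounding loop can simply continue with next() as usual.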

Bottom line: you only know you got the whole data when you hit the END_ELEMENT event. There is no guarantee that the text will be in a single CHARACTERS event.
