Reputation: 73
I'm parsing XML using XMLStreamReader. The <dbresponse> tag contains data loaded from a database (a WebRowSet object). The problem is that the content of this tag is very long (several hundred kilobytes; the data is Base64-encoded), but input.getText() reads only 16,394 characters of it.
I'm 100% sure the data coming into XMLStreamReader is OK.
I found another answer here, but it doesn't solve my problem. I could of course read the data some other way, but I would like to know what is wrong with this approach.
Does anybody know how to get the whole content?
My code:
    input = xmlFactory.createXMLStreamReader(new ByteArrayInputStream(xmlData.getBytes("UTF-8")));
    while (input.hasNext()) {
        if (input.getEventType() == XMLStreamConstants.START_ELEMENT) {
            element = input.getName().getLocalPart();
            switch (element.toLowerCase()) {
                case "transactionresponse":
                    int transactionStatus = 0;
                    transactionResponse = new TransactionResponse();
                    for (int i = 0; i < input.getAttributeCount(); i++) {
                        switch (input.getAttributeLocalName(i)) {
                            case "status":
                                transactionStatus = TransactionResponse.getStatusFromName(input.getAttributeValue(i));
                        }
                    }
                    transactionResponse.setStatus(transactionStatus);
                    break;
                case "dbresponse":
                    for (int i = 0; i < input.getAttributeCount(); i++) {
                        switch (input.getAttributeLocalName(i)) {
                            case "request_id":
                                id = Integer.parseInt(input.getAttributeValue(i));
                                break;
                            case "status":
                                status = Response.getStatusFromName(input.getAttributeValue(i));
                        }
                    }
                    break;
            }
        } else if (input.getEventType() == XMLStreamConstants.CHARACTERS) {
            switch (element.toLowerCase()) {
                case "dbresponse":
                    // Only the text of this single CHARACTERS event is read here
                    String data = input.getText();
                    if (!data.equals("\n")) {
                        data = new String(Base64.decode(data), "UTF-8");
                    }
                    Response response = new Response(data, status, id);
                    if (transactionResponse != null) {
                        transactionResponse.addResponse(response);
                    } else {
                        this.addResponse(response);
                    }
                    id = -1;
                    status = -1;
                    break;
            }
            element = "";
        } else if (input.getEventType() == XMLStreamConstants.END_ELEMENT) {
            switch (input.getLocalName().toLowerCase()) {
                case "transactionresponse":
                    this.addTransactionResponse(transactionResponse);
                    transactionResponse = null;
                    break;
            }
        }
        input.next();
    }
Upvotes: 2
Views: 2289
Reputation: 34638
Event-driven XML parsers such as XMLStreamReader are designed to let you parse XML without having to read it all into memory in one go, which is essential when the XML is very large.
The design is such that the parser reads a certain buffer of data and gives you events as it runs into "interesting" things, such as the beginning of a tag, the end of a tag, and so on.
But the buffer it reads is not infinite, because it is meant to handle large XML files exactly like the one you have. For this reason, a long text inside a tag may be reported as several consecutive CHARACTERS events.
That is, when you get a CHARACTERS event, there is no guarantee that it contains the whole text. If the text is too long for the reader's buffer, you will simply get more CHARACTERS events after it.
Since you are only reading the data from the first CHARACTERS event, you do not get the whole data.
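To see the chunking for yourself, a small diagnostic like the one below can help. It is only a sketch: the class name CountCharacterEvents and the helper countChunks are made up for illustration, and the number of CHARACTERS events you observe depends on the StAX implementation and its buffer size, so no particular count should be expected.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class CountCharacterEvents {

    // Hypothetical helper: returns {number of CHARACTERS events, total characters}
    // seen inside the first element whose local name matches elementName.
    static int[] countChunks(String xml, String elementName) throws XMLStreamException {
        XMLStreamReader input = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        int events = 0;
        int totalChars = 0;
        boolean inside = false;
        try {
            while (input.hasNext()) {
                int event = input.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && input.getLocalName().equalsIgnoreCase(elementName)) {
                    inside = true;
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && input.getLocalName().equalsIgnoreCase(elementName)) {
                    break; // the element is finished, stop counting
                } else if (inside && event == XMLStreamConstants.CHARACTERS) {
                    events++;
                    totalChars += input.getTextLength();
                }
            }
        } finally {
            input.close();
        }
        return new int[] { events, totalChars };
    }

    public static void main(String[] args) throws XMLStreamException {
        // Build a 400,000-character payload, similar in size to the one in the question.
        StringBuilder payload = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            payload.append("QUJD");
        }
        String xml = "<root><dbresponse>" + payload + "</dbresponse></root>";
        int[] result = countChunks(xml, "dbresponse");
        System.out.println("CHARACTERS events: " + result[0] + ", total characters: " + result[1]);
    }
}
```

If the event count is greater than one, reading getText() from the first event alone necessarily returns only part of the text.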
The proper way to work with such a file is:
1. When you hit the START_ELEMENT event for the element you are interested in, you make preparations for storing the text. For example, create a StringBuilder, or open a file for writing, etc.
2. For every CHARACTERS event that follows, you append the text to your storage (the StringBuilder, the file).
3. When you hit the END_ELEMENT event for the same element, you finish accumulating your data, and do whatever you need to do with it.
In fact, this is what the getElementText() method does for you - it accumulates the data in a StringBuffer while going through CHARACTERS events until it hits the END_ELEMENT.
Bottom line: you only know you have the whole data when you hit the END_ELEMENT event. There is no guarantee that the text will arrive in a single CHARACTERS event.
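The accumulate-until-END_ELEMENT approach described in this answer could look roughly like this. It is a minimal sketch, not the asker's actual class: the name AccumulateCharacters and the helper readText are invented for the example, and it assumes the target element contains only text (no child elements).

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class AccumulateCharacters {

    // Hypothetical helper: returns the complete text of the first element whose
    // local name matches elementName, or null if no such element is found.
    static String readText(String xml, String elementName) throws XMLStreamException {
        XMLStreamReader input = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        StringBuilder text = null; // non-null only while inside the target element
        try {
            while (input.hasNext()) {
                int event = input.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && input.getLocalName().equalsIgnoreCase(elementName)) {
                    // Step 1: prepare the storage.
                    text = new StringBuilder();
                } else if (text != null
                        && (event == XMLStreamConstants.CHARACTERS
                            || event == XMLStreamConstants.CDATA)) {
                    // Step 2: append every chunk, however many events it takes.
                    text.append(input.getText());
                } else if (text != null
                        && event == XMLStreamConstants.END_ELEMENT
                        && input.getLocalName().equalsIgnoreCase(elementName)) {
                    // Step 3: only now is the text known to be complete.
                    return text.toString();
                }
            }
        } finally {
            input.close();
        }
        return null;
    }

    public static void main(String[] args) throws XMLStreamException {
        StringBuilder payload = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            payload.append("QUJD");
        }
        String xml = "<root><dbresponse request_id=\"1\">" + payload + "</dbresponse></root>";
        System.out.println(readText(xml, "dbresponse").length()); // prints 400000
    }
}
```

In the question's code, the Base64 decoding and the creation of the Response would move from the CHARACTERS branch to the END_ELEMENT branch for dbresponse, so that they run once on the complete text.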
Upvotes: 3
Reputation: 2110
I think XMLStreamReader delivers the data in chunks, so you could try calling getText() in a loop and concatenating all the chunks.
Alternatively, what about the getElementText() method?
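A sketch of the getElementText() route follows. The class name ElementTextDemo and the helper firstElementText are hypothetical, and java.util.Base64 (Java 8+) is used here as a stand-in for whatever Base64 class the question uses. Two points to keep in mind: getElementText() must be called while the reader is positioned on START_ELEMENT, so any attributes have to be read before calling it, and it throws an XMLStreamException if the element contains child elements.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ElementTextDemo {

    // Hypothetical helper: returns the whole text of the first matching element.
    static String firstElementText(String xml, String elementName) throws XMLStreamException {
        XMLStreamReader input = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (input.hasNext()) {
                if (input.next() == XMLStreamConstants.START_ELEMENT
                        && input.getLocalName().equalsIgnoreCase(elementName)) {
                    // Read attributes here if needed; after getElementText() the
                    // reader is positioned on the matching END_ELEMENT.
                    return input.getElementText();
                }
            }
        } finally {
            input.close();
        }
        return null;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><dbresponse request_id=\"7\">SGVsbG8=</dbresponse></root>";
        String encoded = firstElementText(xml, "dbresponse");
        // The MIME decoder ignores line breaks that may appear inside the Base64 text.
        String decoded = new String(Base64.getMimeDecoder().decode(encoded), StandardCharsets.UTF_8);
        System.out.println(decoded); // prints Hello
    }
}
```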
Upvotes: 0