Reputation: 1418
I'm trying to parse an InkML-like document. Each node's content consists of several comma-separated tuples of 6 or 7 numbers (which may be negative or decimal).
While testing, I noticed that the SAX characters method does not capture all the data.
The code:
public class PenParser extends DefaultHandler {
    // irrelevant code omitted
    public void characters(char ch[], int start, int length) throws SAXException {
        // begin my debug print
        StringBuilder buffer = new StringBuilder();
        for (int i = start; i < length; i++) {
            buffer.append(ch[i]);
        }
        System.out.println(">" + buffer);
        // end my debug print
    }
}
When debugging, I see that buffer does not contain all the numbers of the tag I'm interested in, only the first 107 (more or less) characters of its content (my rows are no longer than 4610 characters). This truncation by StringBuffer and SAX parsing seems strange to me.
I also tried StringBuilder, but the problem remains.
Any suggestions?
Upvotes: 1
Views: 3068
Reputation: 2575
Yes, that's to be expected: characters may be called several times while a single node is parsed.
You'll have to keep the StringBuilder as a member, append the content in characters, and deal with the content in endElement.
By the way, you do not need to build the buffer character by character. This is the implementation of characters that I always use:
@Override
public void characters(char[] ch, int start, int length) throws SAXException
{
    characters.append(new String(ch, start, length));
}
... and don't forget:
@Override
public void endElement(String uri, String localName, String qName)
        throws SAXException
{
    final String content = characters.toString().trim();
    // .... deal with content
    // reset characters
    characters.setLength(0);
}

private final StringBuilder characters = new StringBuilder(64);
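For completeness, here is a minimal, self-contained sketch of the same accumulate-then-flush pattern that can be run as is. The class name TraceHandler, the parse helper and the element name trace are my own assumptions for illustration (the question does not say what the tags are called), so adapt them to your document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the full text of every <trace> element.
class TraceHandler extends DefaultHandler {
    // Member buffer, so text survives across several characters() calls.
    private final StringBuilder text = new StringBuilder();
    private final List<String> traces = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Discard whitespace or text that preceded this element.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The valid range is ch[start] .. ch[start + length - 1].
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only here is the element's text guaranteed to be complete.
        if ("trace".equals(qName)) {
            traces.add(text.toString().trim());
        }
        text.setLength(0);
    }

    // Assumed convenience method: parses an XML string and returns the trace texts.
    static List<String> parse(String xml) throws Exception {
        TraceHandler handler = new TraceHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.traces;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<ink><trace> 1,2,-3.5 </trace><trace>4,5</trace></ink>"));
    }
}
```

Note also that the debug loop in the question runs while i < length rather than i < start + length, so it drops characters whenever start is greater than zero, independently of the multiple-call issue.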
Upvotes: 9