Reputation: 31
I am having trouble parsing self-closing XML tags using SAX. I am trying to extract the link tag from the Google Base API. I am having reasonable success parsing regular tags.
Here is a snippet of the XML:
<entry>
<id>http://www.google.com/base/feeds/snippets/15802191394735287303</id>
<published>2010-04-05T11:00:00.000Z</published>
<updated>2010-04-24T19:00:07.000Z</updated>
<category scheme='http://base.google.com/categories/itemtypes' term='Products'/>
<title type='text'>En-el1 Li-ion Battery+charger For Nikon Digital Camera</title>
<link rel='alternate' type='text/html' href='http://rover.ebay.com/rover/1/711-67261-24966-0/2?ipn=psmain&icep_vectorid=263602&kwid=1&mtid=691&crlp=1_263602&icep_item_id=170468125748&itemid=170468125748'/>
.
.
and so on
I can parse the updated and published tags, but not the link and category tags.
Here are my startElement and endElement overrides:
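public void startElement(String uri, String localName, String qName,
        Attributes attributes) throws SAXException {
    // Check for an empty stack first: peek() throws EmptyStackException
    // if "title" happens to be the root element.
    if (qName.equals("title") && !xmlTags.isEmpty()
            && xmlTags.peek().equals("entry")) {
        insideEntryTitle = true;
    }
    xmlTags.push(qName);
}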
public void endElement(String uri, String localName, String qName)
        throws SAXException {
    xmlTags.pop();
    // If a "title" element is closed, we start a new line, to prepare
    // printing the new title.
    if (insideEntryTitle) {
        insideEntryTitle = false;
        System.out.println();
    }
}
Declaration of xmlTags:
private Stack<String> xmlTags = new Stack<String>();
Any help, guys?
This is my first post here. I hope I have followed the posting rules! Thanks a ton.
Correction: endElement gets called; characters does not.
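public void characters(char[] ch, int start, int length) throws SAXException {
    if (insideEntryTitle) {
        String url = new String(ch, start, length);
        // Print the variable that was just built (the original printed "title").
        System.out.println("url=" + url);
        i++; // counter; its declaration is not shown in the post
    }
}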
Upvotes: 3
Views: 4216
Reputation: 96444
What characters does is deliver the content between the XML element tags (in chunks, one chunk per method call). So if you have an XML element like
<Foo someattrib="" />
then characters doesn't get called, because there's no content there for the parser to tell you about.
If your code relies on characters being called even when the element is empty, the design is wrong.
The characters method should only append element text to a buffer; startElement and endElement need to be in charge of clearing and reading the buffer, because endElement is the place where you know you've received all the element text. It is fine for characters not to be called when there is nothing to read.
Because any single characters call may not deliver all the content, there must not be any business logic in that method. If there is, your code will break at some point.
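A minimal sketch of that buffering pattern, assuming the default (non-namespace-aware) JAXP parser so that qName is the plain element name. The class name TitleHandler and its parse helper are made up for illustration; they are not from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of every <title> element.
class TitleHandler extends DefaultHandler {
    final List<String> titles = new ArrayList<>();
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        // A new element starts: discard any text collected so far.
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // No business logic here, only accumulation; the parser may
        // deliver one element's text across several calls.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only now is the element's text known to be complete.
        if (qName.equals("title")) {
            titles.add(buffer.toString().trim());
        }
        buffer.setLength(0);
    }

    // Convenience helper (an assumption of this sketch) that parses a string.
    static List<String> parse(String xml) throws Exception {
        TitleHandler handler = new TitleHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }
}
```

Clearing the buffer in startElement keeps the sketch short but drops mixed content (text that surrounds child elements); a handler that needs it would have to keep one buffer per open element.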
For how to implement characters see this example. If what you want to do is read attribute values see this example.
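For the self-closing link tag in the question, the data lives in attributes, which the parser passes to startElement. A rough sketch of reading href there, again assuming the default non-namespace-aware JAXP parser; the class name LinkHandler and its parse helper are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the href attribute of every <link> element.
class LinkHandler extends DefaultHandler {
    final List<String> hrefs = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        if (qName.equals("link")) {
            // getValue returns null when the attribute is absent.
            String href = attributes.getValue("href");
            if (href != null) {
                hrefs.add(href);
            }
        }
    }

    // Convenience helper (an assumption of this sketch) that parses a string.
    static List<String> parse(String xml) throws Exception {
        LinkHandler handler = new LinkHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.hrefs;
    }
}
```

Because startElement fires for empty elements too, this works whether or not characters is ever called for the tag.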
Upvotes: 2