Reputation: 79
I have a SAX parser, and one XML element contains the following text: "A & amp; B" (there is no space in the real data; I added it so the entity would not be rendered as & here).
It looks as though the text is being unescaped twice and then cut off at the ampersand, so I end up with just "A". Here is the process:
The XML file is downloaded:
InputStream _inputStream = _urlConnection.getInputStream();
BufferedInputStream _bufferedInputStream = new BufferedInputStream(_inputStream);
ByteArrayBuffer _byteArrayBuffer = new ByteArrayBuffer(64);
int current = 0;
while((current = _bufferedInputStream.read()) != -1)
{
_byteArrayBuffer.append((byte)current);
}
FileOutputStream _fileOutputStream = openFileOutput(_file, MODE_PRIVATE);
_fileOutputStream.write(_byteArrayBuffer.toByteArray());
_fileOutputStream.close();
The data is then picked up by the SAX handler in endElement:
else if (inLocalName.equalsIgnoreCase(_nodeTitle))
{
_titleValue = currentValue;
currentValue = "";
}
In the debugger, the ampersand is already converted and the text already truncated by the time I read it in the handler's characters method.
I've seen a lot of questions about this but never a solution. Any ideas?
Thanks
Parser:
List<PropertiesList> _theList = null;
try
{
// Create Factory, Parser, Reader, Handler
SAXParserFactory _saxParserFactory = SAXParserFactory.newInstance();
SAXParser _saxParser = _saxParserFactory.newSAXParser();
XMLReader _xmlReader = _saxParser.getXMLReader();
HandlerReps _handler = new HandlerReps(inRegion, inAbbreviation);
_xmlReader.setContentHandler(_handler);
_xmlReader.parse(new InputSource(inStream));
_theList = _handler.getTheList();
}
Handler:
// Called when Tag Begins
@Override
public void startElement(String uri, String inLocalName, String inQName, Attributes inAttributes) throws SAXException
{
// Start capturing text for this element (with false here, characters() would never store anything)
currentElement = true;
}
// Called when Tag Ends
@Override
public void endElement(String inUri, String inLocalName, String inQName) throws SAXException
{
currentElement = false;
// Enclosing element (end of one entry)
if (inLocalName.equalsIgnoreCase(_nodeValue))
{
if (_stateValue.equalsIgnoreCase(_abbreviation) &&
_countryValue.equalsIgnoreCase(_region))
{
// Construct the object
PropertiesRegion _regionObject = new PropertiesRegion(_titleValue, _address1Value);
cList.add(_regionObject);
Log.d(TAG, _regionObject.toString());
}
_titleValue = "";
_address1Value = "";
}
// Title
else if (inLocalName.equalsIgnoreCase(_nodeTitle))
{
_titleValue = currentValue;
currentValue = "";
}
// Address1
else if (inLocalName.equalsIgnoreCase(_nodeAddress1))
{
_address1Value = currentValue;
currentValue = "";
}
}
// Called to get Tag Characters
@Override
public void characters(char[] inChar, int inStart, int inLength) throws SAXException
{
if (currentElement)
{
currentValue = new String(inChar, inStart, inLength);
currentElement = false;
}
}
Upvotes: 0
Views: 719
Reputation: 3597
This is very likely the cause of your problem:
if (currentElement)
{
currentValue = new String(inChar, inStart, inLength);
currentElement = false;
}
For each text content node, the SAX parser may send multiple characters() events to your handler; parsers commonly split the text at an entity reference such as &amp;, which is why you see only "A". You only get the whole text if you concatenate all these events. But in your code, only the first of these events is used, because you then set currentElement = false.
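A minimal sketch of the concatenation approach, not a drop-in replacement for your handler: the class name AccumulatingHandlerDemo, the TitleHandler class, the parseTitles helper and the <rep>/<title> element names are all made up for illustration. It appends every characters() chunk to a StringBuilder and only reads the result in endElement. It matches on qName because the default SAXParserFactory is not namespace-aware; in your handler you may need localName instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class AccumulatingHandlerDemo {

    // Hypothetical handler: collects the full text of every <title> element.
    static class TitleHandler extends DefaultHandler {
        private final StringBuilder buffer = new StringBuilder();
        final List<String> titles = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            // A new element begins: discard any text collected so far.
            buffer.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one text node, so append instead of assigning.
            buffer.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // The text is complete only when the element ends.
            if ("title".equalsIgnoreCase(qName)) {
                titles.add(buffer.toString());
            }
            buffer.setLength(0);
        }
    }

    // Hypothetical helper: parses an XML string and returns the collected titles.
    static List<String> parseTitles(String xml) throws Exception {
        TitleHandler handler = new TitleHandler();
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseTitles("<rep><title>A &amp; B</title></rep>"));
    }
}
```

Resetting the buffer in startElement assumes the elements you read contain only text, with no child elements; for mixed content you would need a buffer per element.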
The problem is not ampersand conversion. As a general rule, when you describe a problem, it is often better to only describe the symptoms, not any supposed causes.
Upvotes: 1