Reputation: 189
I am trying to parse an XML file that looks like this:
<?xml version="1.0" encoding="utf-8"?>
<downloaddata>
<downloaditem itemid="1">
<title>Abdul kalaam Inspirational Talk</title>
<downloadlink>http://o-o.preferred.spectranet-blr1.v8.lscache4.c.youtube.com/videoplayback?upn=Rxb-DvFeBTE&sparams=cp%2Cid%2Cip%2Cipbits%2Citag%2Cratebypass%2Csource%2Cupn%2Cexpire&fexp=906512%2C907217%2C907335%2C921602%2C919306%2C919316%2C904455%2C919324%2C904452&itag=18&ip=203.0.0.0&signature=96D7FA17DF684B4C2CD30F12251F3263C83EC443.05F62E98E1059BB44459ABF319F50DC4B7E6D90E&sver=3&ratebypass=yes&source=youtube&expire=1337691481&key=yt1&ipbits=8&cp=U0hSTFZUT19NS0NOMl9OTlNFOmlwaTFSSGFfd3NK&id=67ffa1d50864f57d&title=Abdul%20Kalam%20inspirational%20Speech%20on%20Leadership%20and%20Motivation</downloadlink>
</downloaditem>
</downloaddata>
It seems that parsing fails when the content of the downloadlink tag is as above. I have tried replacing the content with other text of the same length, and it works.
Below is the Android code I am using.
import java.io.File;
import java.io.IOException;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import android.os.Environment;
public class Wilxmlparser extends DefaultHandler{
List<VideoDetails> downloadList;
private String tempVal;
private VideoDetails tempVidDet;
public Wilxmlparser(){
}
public void parseXML() {
//get a factory
SAXParserFactory spf = SAXParserFactory.newInstance();
try {
//get a new instance of parser
SAXParser sp = spf.newSAXParser();
File downloadInfo =new File(Environment.getExternalStorageDirectory()+"/watchitlater/config/downloadinfo1.xml");
//parse the file and also register this class for call backs
sp.parse(downloadInfo, this);
}catch(SAXException se) {
se.printStackTrace();
}catch(ParserConfigurationException pce) {
pce.printStackTrace();
}catch (IOException ie) {
ie.printStackTrace();
}
}
//Event Handlers
@Override
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
//reset
tempVal = "";
if(qName.equalsIgnoreCase("downloaditem")) {
tempVidDet = new VideoDetails();
tempVidDet.setItemId(Integer.parseInt(attributes.getValue("itemid")));
}
}
@Override
public void characters(char[] ch, int start, int length) throws SAXException {
tempVal = new String(ch,start,length);
}
@Override
public void endElement(String uri, String localName, String qName) throws SAXException {
if(qName.equalsIgnoreCase("downloaditem")) {
downloadList.add(tempVidDet);
}else if (qName.equalsIgnoreCase("title")) {
tempVidDet.setTitle(tempVal);
}else if (qName.equalsIgnoreCase("downloadlink")) {
tempVidDet.setDownloadLink(tempVal);
}
}
}
With the above code, I never get a callback to endElement for the above XML file.
However, if the XML looks like
<?xml version="1.0" encoding="utf-8"?>
<downloaddata>
<downloaditem itemid="1">
<title>Abdul kalaam Inspirational Talk</title>
<downloadlink>http://www.gmail.com/hello/world/sdfsdf%20.@@%!@# ($dwe</downloadlink>
</downloaditem>
</downloaddata>
or
<?xml version="1.0" encoding="utf-8"?>
<downloaddata>
<downloaditem itemid="1">
<title>Abdul kalaam Inspirational Talk</title>
<downloadlink>httphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttphttpa</downloadlink>
</downloaditem>
</downloaddata>
Then it works fine. What am I doing wrong?
Upvotes: 3
Views: 565
Reputation: 41137
The reason your parser cannot parse the XML in question is that it is not well-formed XML. The downloadlink text contains raw & characters, which must be escaped (as &amp;) inside XML text. Your replacement test data happened to contain no & or < characters, which is why it parsed. See "Characters and escaping" in the Wikipedia article on XML for further info.
This is best corrected in whatever is producing the xml, and the easiest fix would be to wrap the offending text in a CDATA section.
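As a rough illustration of both fixes on the producing side, here is a minimal sketch in plain Java. The class and method names (XmlTextEscaper, escapeText, wrapInCdata) and the example.com URL are made up for this example; I don't know what actually generates your file, so treat this as an assumption about how the writer could look.

```java
class XmlTextEscaper {

    // Escapes the characters that may not appear literally in XML text content.
    static String escapeText(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Wraps the text in a CDATA section. A literal "]]>" would end the
    // section early, so it is split across two adjacent CDATA sections.
    static String wrapInCdata(String raw) {
        return "<![CDATA[" + raw.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    public static void main(String[] args) {
        // Hypothetical short URL standing in for the long one in the question.
        String url = "http://example.com/videoplayback?itag=18&sver=3";
        System.out.println("<downloadlink>" + escapeText(url) + "</downloadlink>");
        System.out.println("<downloadlink>" + wrapInCdata(url) + "</downloadlink>");
    }
}
```

Either form should give the parser well-formed input, and in both cases the handler receives the original URL with plain & characters.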
Once the data is fixed, however, you may also see an issue caused by a misconception in your parsing code.
@Override
public void characters(char[] ch, int start, int length) throws SAXException {
tempVal = new String(ch,start,length);
}
will not always get all the characters between the start and end tags, because the contract for this method allows the parser to call it more than once for a single run of text. Instead of overwriting a string on each call, you need to append to a string buffer (for example a StringBuilder) that is reset in the startElement method and read in the endElement method.
See my answer to another SO question for a bit more on this characters method parsing issue.
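To make the accumulation pattern concrete, here is a minimal sketch of a handler that appends in characters. It is not your Wilxmlparser: LinkCollector and its parse helper are hypothetical names, it only collects the downloadlink text, and it reads from a string rather than a file so it can run outside Android.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class LinkCollector extends DefaultHandler {
    final List<String> links = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // Start collecting fresh text for every element.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so append instead of assigning.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equalsIgnoreCase("downloadlink")) {
            links.add(text.toString());
        }
    }

    // Convenience helper for this sketch: parses an XML string and returns the links found.
    static List<String> parse(String xml) {
        try {
            LinkCollector handler = new LinkCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.links;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Text containing an entity such as &amp; is a typical case where a parser may deliver the content in more than one characters call, so appending is what keeps the URL in one piece.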
Upvotes: 1
Reputation: 809
The parser will not accept unescaped special characters. You need to replace any special characters present in the text.
You can pass this text to TextUtils.htmlEncode(string) and then start parsing; I think that will work. Alternatively, change it on the server side to give you the data encoded with the UTF-8 charset, and decode it with the same charset on the device side.
Upvotes: 1