thepoosh

Reputation: 12587

Android XML parsers don't parse the complete link

So, I've been working on parsing an XML file (an RSS feed) downloaded from the internet.

I've been following IBM's parser examples, which can be found here.

Unfortunately, when I try to parse a link that looks like this:

http://www.website.net/index.php?option=com_adsmanager&page=display&catid=87&tid=208196

my parsers only return the link as http://www.website.net/index.php?option=, and the rest of the link gets cut off.

Any thoughts on how to fix this?

Edit 1:

The SAX parser doesn't work at all. It claims (incorrectly) that the document is not well-formed, but I know that's not true, since it was checked and double-checked.

Edit 2:

The NodeList had more than one child, and every ampersand (&) created a new node.

Therefore, the code I had:

if (name.equalsIgnoreCase(LINK)) {
    val = property.getFirstChild().getNodeValue();
    message.setLink(val);
}

was not good, so I changed it to this:

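if (name.equalsIgnoreCase(LINK)) {
    // The link text may be split across several child nodes, so join them all.
    val = "";
    NodeList list = property.getChildNodes();
    for (int i = 0; i < list.getLength(); i++) {
        // NodeList exposes item(i), not get(i); getNodeValue() already returns a String.
        val += list.item(i).getNodeValue();
    }
    message.setLink(val);
}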

That was the way to do this in the DOM XML feed parser. Now all I have to do is find a way to do the same with the other parsers from the IBM examples.
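For the SAX-style parsers, the equivalent idea is probably to accumulate text across callbacks, because characters() may be called several times for one element. Below is a minimal sketch of that, not the IBM code itself: the class SaxLinkReader and the method firstLink are hypothetical names, and the sample feed uses escaped ampersands because a strict desktop parser rejects bare ones.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of the IBM examples.
final class SaxLinkReader {

    // Returns the full text of the first <link> element in the given XML string.
    static String firstLink(String xml) {
        final StringBuilder text = new StringBuilder();
        final boolean[] state = new boolean[2]; // [0] = inside <link>, [1] = first link finished

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (!state[1] && "link".equalsIgnoreCase(qName)) {
                    state[0] = true;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May fire more than once per element, so append instead of assigning.
                if (state[0]) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (state[0] && "link".equalsIgnoreCase(qName)) {
                    state[0] = false;
                    state[1] = true;
                }
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Wrapped so callers do not have to handle checked parser exceptions.
            throw new IllegalStateException("could not parse feed", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String xml = "<item><link>http://www.website.net/index.php?option=com_adsmanager"
                + "&amp;page=display</link></item>";
        // prints http://www.website.net/index.php?option=com_adsmanager&page=display
        System.out.println(firstLink(xml));
    }
}
```

The key point is the append in characters(): assigning the latest chunk instead would reproduce the truncated link.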

Upvotes: 1

Views: 376

Answers (2)

thepoosh

Reputation: 12587

Well, I sort of solved this.

My second update was the correct diagnosis: the NodeList had more than one child, and every ampersand (&) created a new node.

Therefore, the code I had:

if (name.equalsIgnoreCase(LINK)) {
    val = property.getFirstChild().getNodeValue();
    message.setLink(val);
}

was not good, so I changed it to this:

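if (name.equalsIgnoreCase(LINK)) {
    // The link text may be split across several child nodes, so join them all.
    val = "";
    NodeList list = property.getChildNodes();
    for (int i = 0; i < list.getLength(); i++) {
        // NodeList exposes item(i), not get(i); getNodeValue() already returns a String.
        val += list.item(i).getNodeValue();
    }
    message.setLink(val);
}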

That was the way to do this in the DOM XML feed parser.
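For anyone who wants to try the DOM approach outside the app, here is a small self-contained sketch. The class DomLinkText and its methods are hypothetical names. A strict desktop parser refuses bare ampersands, so the sample feed uses a CDATA section to get an element with more than one child node; that stand-in is my assumption, not the original feed. Node.getTextContent() (DOM Level 3) is shown as an alternative that does the joining for you, where the platform provides it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
final class DomLinkText {

    // Parses the XML string and returns the first <link> element.
    static Node firstLinkNode(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("link").item(0);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse feed", e);
        }
    }

    // Joins the values of all direct children, skipping children without a value.
    static String joinChildren(Node element) {
        StringBuilder sb = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            String value = children.item(i).getNodeValue();
            if (value != null) {
                sb.append(value);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Text followed by a CDATA section, so <link> has more than one child.
        String xml = "<item><link>http://example.com/?a=1<![CDATA[&b=2]]></link></item>";
        Node link = firstLinkNode(xml);
        System.out.println("first child only: " + link.getFirstChild().getNodeValue());
        System.out.println("joined children:  " + joinChildren(link));
        System.out.println("getTextContent(): " + link.getTextContent());
    }
}
```

Using a StringBuilder instead of += is only a tidiness choice here; for a single short link either works.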

Upvotes: 0

Anders Lindahl

Reputation: 42870

<link>http://www.website.net/index.php?option=com_adsmanager&page=display&catid=87&tid=208196</link>

...is not well-formed XML, since the &s do not begin valid XML entity references.

There are a couple of ways to work around this:

Escape the &s:

<link>http://www.website.net/index.php?option=com_adsmanager&amp;page=display&amp;catid=87&amp;tid=208196</link>

Wrap the link text in a CDATA section:

<link><![CDATA[http://www.website.net/index.php?option=com_adsmanager&page=display&catid=87&tid=208196]]></link>

If you are not in control of the RSS file creation, you will have to pre-process the document before feeding it to an XML parser. More forgiving XML parsers like TagSoup might be helpful.
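One possible way to do that pre-processing is a regular expression that escapes every & which does not already start an entity or character reference. This is a rough sketch under that assumption: AmpersandFixer and its methods are hypothetical names, and the pattern also rewrites ampersands inside CDATA sections, so it only suits feeds that do not use CDATA.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
final class AmpersandFixer {

    // Matches an '&' that is NOT the start of a reference such as &amp;, &#38; or &#x26;.
    private static final Pattern BARE_AMP =
            Pattern.compile("&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Replaces every bare ampersand with &amp; and leaves existing references alone.
    static String escapeBareAmpersands(String rawXml) {
        return BARE_AMP.matcher(rawXml).replaceAll("&amp;");
    }

    // Escapes the raw feed, parses it, and returns the text of the first <link>.
    static String firstLink(String rawXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(escapeBareAmpersands(rawXml))));
            return doc.getElementsByTagName("link").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse feed", e);
        }
    }

    public static void main(String[] args) {
        String raw = "<item><link>http://www.website.net/index.php?option=com_adsmanager"
                + "&page=display&catid=87&tid=208196</link></item>";
        // prints http://www.website.net/index.php?option=com_adsmanager&page=display&catid=87&tid=208196
        System.out.println(firstLink(raw));
    }
}
```

Because already-escaped references are skipped, running the fixer on a feed that is partly escaped should not double-escape it.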

Upvotes: 1
