JakubW

Reputation: 1121

XmlPullParser unclosed tag ignore

The XML data that my application receives is somewhat corrupted. Since I cannot change the source, I need to find a workaround.

This is what the corrupted part looks like:

<line> I like cookies <u>Do you like them too?</u> </line>

Is there any way to force XmlPullParser to ignore the <u> and </u> tags?

For now I am able to read the first part of the <line> string, but I also need the rest of it.

Or is there any way to read <u> and </u> as plain text instead of as tags?

Thanks for the help!

Upvotes: 1

Views: 439

Answers (2)

Ramp

Reputation: 1772

You can add some logic to your parsing loop that extracts the text from the XML while ignoring the tags you don't want. For the example you provided, you can do something like the following to capture all the text inside the line tag, regardless of which tags are nested in it:

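import java.io.StringReader;
import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserFactory;

public class LineTextExtractor {
    public static void main(String[] args) throws Exception {
        XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XmlPullParser xpp = factory.newPullParser();
        // True while the parser is between <line> and </line>.
        boolean inLineTag = false;
        StringBuilder strBldr = new StringBuilder();
        xpp.setInput(new StringReader(
                "<line> I like cookies <u>Do you like them too?</u> </line>"));
        int eventType = xpp.getEventType();
        while (eventType != XmlPullParser.END_DOCUMENT) {
            if (eventType == XmlPullParser.START_TAG) {
                // Nested tags such as <u> are skipped; only <line> toggles the flag.
                if ("line".equals(xpp.getName())) {
                    inLineTag = true;
                }
            } else if (eventType == XmlPullParser.END_TAG) {
                if ("line".equals(xpp.getName())) {
                    inLineTag = false;
                }
            } else if (eventType == XmlPullParser.TEXT) {
                // Collect every text chunk inside <line>, including text inside <u>.
                if (inLineTag) {
                    strBldr.append(xpp.getText());
                }
            }
            eventType = xpp.next();
        }

        System.out.println("Text " + strBldr.toString());
    }
}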
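
If the tags should survive as literal text instead of being dropped, one possible approach is to escape them before the input reaches the parser. The following is only a sketch using plain java.util.regex. The class and method names (InlineTagEscaper, escapeUTags) are made up for illustration, and it assumes the stray tags are always exactly <u> and </u> with no attributes.

```java
import java.util.regex.Pattern;

class InlineTagEscaper {
    // Assumption for this sketch: only bare <u> and </u> occur, never <u attr="...">.
    private static final Pattern U_TAG = Pattern.compile("<(/?)u>");

    /** Rewrites <u> and </u> as &lt;u&gt; and &lt;/u&gt; so a parser sees them as text. */
    static String escapeUTags(String xml) {
        // $1 is the optional slash captured from the closing tag.
        return U_TAG.matcher(xml).replaceAll("&lt;$1u&gt;");
    }

    public static void main(String[] args) {
        String raw = "<line> I like cookies <u>Do you like them too?</u> </line>";
        System.out.println(escapeUTags(raw));
    }
}
```

The escaped string can then be passed to xpp.setInput(new StringReader(...)) as above; the parser should report the whole content of line as text, with the entities decoded back to <u> and </u>.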

Hope that helps!

Upvotes: 2

Naveen Ramawat

Reputation: 1445

It would be better to ask your provider to send such data inside a CDATA section, so that you can parse it as a single string.
Example:
<line>I like cookies <u>Do you like them too?</u><![CDATA[<sender>John Smith</sender>]]> </line>
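
To illustrate why CDATA helps, here is a small sketch that reads a line element whose whole content is wrapped in a CDATA section. It uses the JDK's built-in DOM parser rather than XmlPullParser so that it runs without extra libraries; the class and method names (CdataDemo, textOf) are hypothetical, and the input string is an assumed variant of the example, not data from the provider.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CdataDemo {
    /** Returns the text content of the root element; CDATA content is included as plain text. */
    static String textOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed input: the provider wraps the whole line content in CDATA.
        String xml = "<line><![CDATA[I like cookies <u>Do you like them too?</u>]]></line>";
        System.out.println(textOf(xml));
    }
}
```

Because the markup sits inside the CDATA section, the parser never treats <u> as an element, so the string comes back in one piece.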

Upvotes: 0
