Lim Thye Chean

Reputation: 9484

Parsing XML file in Swift

I am trying to parse an XML file using NSXMLParser. Everything seems to work fine initially, but the content comes back truncated and I get some weird results.

func parser(parser: NSXMLParser!, didStartElement elementName: String!, namespaceURI: String!, qualifiedName qName: String!, attributes attributeDict: [NSObject : AnyObject]!) {
    if elementName == "title" {
        foundTitle = true
    }

    if elementName == "description" {
        foundDescription = true
    }
}

func parser(parser: NSXMLParser!, foundCharacters string: String!) {
    if (foundItem) {
        if foundTitle {
            println("Title: \(string)")
            foundTitle = false
        }
        else if foundDescription {
            println("Description: \(string)")
            foundDescription = false
        }
    }
}

The RSS feed I am testing on is This Day in Tech History (http://feedpress.me/ThisDayInTechHistory), and right now the first news item has the following:

Title: IBM’s First Desktop Computer
Description: IBM introduces their System/23 Datamaster desktop computer...

But in my test, this is what I got:

Title: IBM
Description: ’s First Desktop Computer
Description: July 28, 1981 IBM introduces their System/23 Datamaster desktop computer...

Note that the title was truncated at the first ’ and the rest became a description! Is this a bug in NSXMLParser, or have I done something wrong? Thanks!

Upvotes: 2

Views: 9646

Answers (3)

Lucas Cerro

Reputation: 4494

Lim Thye Chean's answer is correct, but here's the problem in your code:

foundTitle = false

You see, foundCharacters stops at the first ’ it encounters and hands you only the text before it. Then you set foundTitle = false, so the remaining part of the string is ignored when foundCharacters is called again to deliver it (because foundTitle is already false).
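To make the effect concrete, here is a small simulation in plain Swift with no parser involved. The two chunks are an assumption taken from the output in the question; the real parser may split the text differently. It only contrasts "take the first chunk and clear the flag" with "append every chunk".

```swift
// Hypothetical chunks, mirroring the split seen in the question's output.
let chunks = ["IBM", "’s First Desktop Computer"]

// Original approach: use the first chunk, then clear the flag.
var foundTitle = true
var firstChunkOnly = ""
for chunk in chunks {
    if foundTitle {
        firstChunkOnly = chunk
        foundTitle = false   // later chunks are no longer treated as title text
    }
}

// Accumulating approach: append every chunk that belongs to the element.
var entryTitle = ""
for chunk in chunks {
    entryTitle += chunk
}

print(firstChunkOnly)  // IBM
print(entryTitle)      // IBM’s First Desktop Computer
```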

The best solution, IMHO, is to use these three delegate methods:

1) In didStartElement, when the element is "title", set foundTitle = true and reset a temporary variable kept as a property, such as entryTitle = String() (so we're clearing out this string every time the parser starts a "title" element). Declaring it with var inside the method would make it local, and the other delegate methods could not see it.

2) foundCharacters is called multiple times, stopping at many "uncommon" characters. We need to append each found string to our temporary variable. So inside foundCharacters, while foundTitle is true, we should say: entryTitle += string (to append to our variable all the little bits of string the parser finds separately)

3) Only when the parser reaches didEndElement for "title" should we assume that we have the complete "title" string. So it's here that we should say foundTitle = false, and also here that you should println(entryTitle)
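The three steps above can be sketched roughly as follows. This is a minimal sketch, not the code from the question: it assumes the current Foundation spelling of the API (XMLParser and XMLParserDelegate rather than NSXMLParser), and the class name TitleCollector, the insideTitle flag, and the sample XML string are made up for illustration.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML   // XMLParser lives in this module on non-Apple platforms
#endif

// Hypothetical delegate that collects the full text of every <title> element.
final class TitleCollector: NSObject, XMLParserDelegate {
    private var insideTitle = false
    private var entryTitle = ""
    private(set) var titles: [String] = []

    // Step 1: a <title> starts, so clear the buffer and raise the flag.
    func parser(_ parser: XMLParser, didStartElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?,
                attributes attributeDict: [String: String] = [:]) {
        if elementName == "title" {
            insideTitle = true
            entryTitle = ""
        }
    }

    // Step 2: may be called several times per element, so append each piece.
    func parser(_ parser: XMLParser, foundCharacters string: String) {
        if insideTitle {
            entryTitle += string
        }
    }

    // Step 3: the element has ended, so the buffer now holds the whole title.
    func parser(_ parser: XMLParser, didEndElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?) {
        if elementName == "title" {
            insideTitle = false
            titles.append(entryTitle)
        }
    }
}

// Made-up sample input; a real feed would be downloaded first.
let sample = "<item><title>IBM’s First Desktop Computer</title></item>"
let collector = TitleCollector()
let xmlParser = XMLParser(data: Data(sample.utf8))
xmlParser.delegate = collector
xmlParser.parse()
print(collector.titles)
```

Because the title is only stored in didEndElement, it does not matter how many pieces the parser splits the text into.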

I hope that helps. I've struggled a lot with the XMLParser, so I've written a short tutorial on understanding how it works: https://medium.com/@lucascerro/understanding-nsxmlparser-in-swift-xcode-6-3-1-7c96ff6c65bc

Upvotes: 2

Lim Thye Chean

Reputation: 9484

I found the issue. After getting the element "item", the text of contained elements like "title" or "description" can be reported multiple times! So "IBM’s First Desktop Computer" arrives as 2 separate pieces of the title, and we need to combine them in a variable and only construct the result when the element ends.

So the new code works like this:

func parser(parser: NSXMLParser!, didStartElement elementName: String!, namespaceURI: String!, qualifiedName qName: String!, attributes attributeDict: [NSObject : AnyObject]!) {
    element = elementName

    if element == "item" {
        isItem = true
        titleText = ""
        ...
    }
}

// Get element text

func parser(parser: NSXMLParser!, foundCharacters string: String!) {
    if isItem {
        if element == "title" {
            titleText += string
        }

        ...
    }
}

// Construct HTML when the element ends

func parser(parser: NSXMLParser!, didEndElement elementName: String!, namespaceURI: String!, qualifiedName qName: String!) {
    if elementName == "item" {
        html += "<b>\(titleText)</b>"
        ...
    }
}

This works!

Upvotes: 1

jmduke

Reputation: 1657

Your guess is correct! NSXMLParser assumes that the string has already been escaped, and will run into issues with unescaped special characters including >, <, ', ", and &.

To do a global replace on a string, you can use the NSString method stringByReplacingOccurrencesOfString, like so:

let xml = "<description>Here's a malformed XML string. Ain't it ugly?</description>"
xml.stringByReplacingOccurrencesOfString("'", withString: "&apos;")

Which returns:

"<description>Here&apos;s a malformed XML string. Ain&apos;t it ugly?</description>"

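For readers on a current toolchain, the same idea might look like the sketch below. It assumes the modern Foundation method replacingOccurrences(of:with:), and the helper name escapeXMLText is hypothetical. It is meant for text content before it is wrapped in tags, since escaping a whole document would also rewrite the markup itself.

```swift
import Foundation

// Hypothetical helper: escapes text content only, not whole XML documents.
func escapeXMLText(_ text: String) -> String {
    // "&" must be replaced first so the entities added afterwards are not re-escaped.
    let replacements = [("&", "&amp;"), ("<", "&lt;"), (">", "&gt;"),
                        ("'", "&apos;"), ("\"", "&quot;")]
    var escaped = text
    for (raw, entity) in replacements {
        escaped = escaped.replacingOccurrences(of: raw, with: entity)
    }
    return escaped
}

let text = "Here's a string. Ain't it ugly?"
let escapedXML = "<description>\(escapeXMLText(text))</description>"
print(escapedXML)  // <description>Here&apos;s a string. Ain&apos;t it ugly?</description>
```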
Upvotes: 2
