Jack Solomon

Reputation: 890

NSXMLParser not working on some tags

I am trying to parse an XML file (link below) to get all the text inside the l tags. This mostly works, but when a line ends with a dash (-), the parser drops the rest of the line and I get just the dash (see the example below). For some lines containing a quotation mark, the quotation mark ends up on its own otherwise blank line. What could be causing this? Is it a text encoding issue, or am I parsing incorrectly?

This is the file:

http://www.perseus.tufts.edu/hopper/xmlchunk?doc=Perseus%3Atext%3A1999.02.0055%3Abook%3D1

I am using code like this to get the content:

- (void) parser:(NSXMLParser *)parser didEndElement:(NSString *)elementname namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementname isEqualToString:@"l"]) {
        NSString *textSoFar = [[NSUserDefaults standardUserDefaults] stringForKey:@"litText"];
        textSoFar = [[NSString alloc] initWithFormat:@"%@\n%@", textSoFar, currentNodeContent];
        [[NSUserDefaults standardUserDefaults] setObject:textSoFar forKey:@"litText"];
    }
}

An example of a problem line is near the start. It should read:

Id metuens, veterisque memor Saturnia belli,
prima quod ad Troiam pro caris gesserat Argis—
necdum etiam causae irarum saevique dolores

But it is coming up as:

Id metuens, veterisque memor Saturnia belli,
—
necdum etiam causae irarum saevique dolores

Let me know if anything in my question is unclear. Thanks in advance for the help.

Also, here is my parser:foundCharacters: code. I commented out the currentNodeContent assignment and it still does not work:

- (void) parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    //currentNodeContent = (NSMutableString *) [string stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
}

Upvotes: 0

Views: 124

Answers (1)

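In your foundCharacters method you probably assign to currentNodeContent. You should append to it instead, because the parser can call foundCharacters several times for the text of a single element. If you assign, only the last chunk survives, which would explain why you end up with just the trailing dash.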

Also see this question: NSXMLParser retrieving wrong data from XML tags

You should have something like this:

In your didStartElement method:

currentNodeContent = [[NSMutableString alloc] init];

And in your foundCharacters method:

[currentNodeContent appendString:string];

Then it should work.
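To put the pieces together, here is a minimal sketch of a complete delegate, assuming ARC and Foundation. The class name LineCollector, its lines array, and the helper LinesFromXML are hypothetical names made up for this example; they are not from the question. Unlike the snippet above, it resets the buffer only when an l element starts, so nested elements inside a line would not wipe the text collected so far, and it collects lines in an array rather than in NSUserDefaults.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate (name is an assumption): collects the text of every <l> element.
@interface LineCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray<NSString *> *lines;
@property (nonatomic, strong) NSMutableString *currentNodeContent;
@end

@implementation LineCollector

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict
{
    // Start a fresh buffer for each <l> element.
    if ([elementName isEqualToString:@"l"]) {
        self.currentNodeContent = [[NSMutableString alloc] init];
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // May be called several times per element, so append instead of assigning.
    // Outside an <l> element the buffer is nil and this message is a no-op.
    [self.currentNodeContent appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementName isEqualToString:@"l"]) {
        // Trim once, on the complete text, not on each partial chunk.
        NSCharacterSet *ws = [NSCharacterSet whitespaceAndNewlineCharacterSet];
        [self.lines addObject:[self.currentNodeContent stringByTrimmingCharactersInSet:ws]];
        self.currentNodeContent = nil;
    }
}

@end

// Hypothetical helper: parses an XML string and returns the text of its <l> elements.
static NSArray<NSString *> *LinesFromXML(NSString *xml)
{
    LineCollector *collector = [[LineCollector alloc] init];
    collector.lines = [NSMutableArray array];
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = collector;  // the parser does not retain its delegate
    [parser parse];
    return collector.lines;
}
```

Called on a small document containing the problem line, LinesFromXML should return the whole line including the trailing dash, regardless of how many chunks the parser delivers it in.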

Upvotes: 2
