Reputation: 890
I am trying to parse an XML file (link below) to get the text of every tagged line. This mostly works, but when a line ends with a dash (-), the parser drops the line's text and leaves just the dash (see the example below). For some lines containing a quotation mark, the quotation mark ends up alone on a new, otherwise blank line. What could be causing this? Is it a text encoding issue, or am I parsing incorrectly?
This is the file:
http://www.perseus.tufts.edu/hopper/xmlchunk?doc=Perseus%3Atext%3A1999.02.0055%3Abook%3D1
I am using code like this to get the content:
- (void) parser:(NSXMLParser *)parser didEndElement:(NSString *)elementname namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementname isEqualToString:@"l"]) {
        NSString *textSoFar = [[NSUserDefaults standardUserDefaults] stringForKey:@"litText"];
        textSoFar = [[NSString alloc] initWithFormat:@"%@\n%@", textSoFar, currentNodeContent];
        [[NSUserDefaults standardUserDefaults] setObject:textSoFar forKey:@"litText"];
    }
}
An example of a problem line is near the start of the file. It should read:
Id metuens, veterisque memor Saturnia belli,
prima quod ad Troiam pro caris gesserat Argis—
necdum etiam causae irarum saevique dolores
But it is coming up as:
Id metuens, veterisque memor Saturnia belli,
—
necdum etiam causae irarum saevique dolores
Let me know if anything in my question is unclear. Thanks in advance for the help.
Also, here is my parser:foundCharacters: code. I commented out the currentNodeContent assignment and it still does not work:
- (void) parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    //currentNodeContent = (NSMutableString *) [string stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
}
Upvotes: 0
Views: 124
Reputation: 626
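In your foundCharacters method you probably assign the incoming string to currentNodeContent, which overwrites whatever was there. You should append instead, because foundCharacters can be called several times for the text of a single element.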
Also see this question: NSXMLParser retrieving wrong data from XML tags
You should have something like this:
In your didStartElement method:
currentNodeContent = [[NSMutableString alloc] init];
And in your foundCharacters method:
[currentNodeContent appendString:string];
Then it should work.
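Putting the pieces together, here is a minimal sketch of a delegate that accumulates text per element. The class name LineCollector, the lines array, and the inline sample XML are assumptions made for illustration and are not from the question's project. It assumes ARC (for example, compiled with -fobjc-arc and linked against Foundation), and it trims whitespace only once the element ends, so partial chunks are never trimmed individually.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate for illustration; the names are not from the original project.
@interface LineCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *currentNodeContent;
@property (nonatomic, strong) NSMutableArray<NSString *> *lines;
@end

@implementation LineCollector

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict
{
    // Start a fresh buffer each time an <l> element opens.
    if ([elementName isEqualToString:@"l"]) {
        self.currentNodeContent = [[NSMutableString alloc] init];
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // May be called several times for one element, so append rather than assign.
    // Outside an <l> element the buffer is nil and this message is a no-op.
    [self.currentNodeContent appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementName isEqualToString:@"l"]) {
        // Trim once, after the whole element text has been collected.
        NSString *line = [self.currentNodeContent
            stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        [self.lines addObject:line];
        self.currentNodeContent = nil;
    }
}

@end

int main(void)
{
    @autoreleasepool {
        // Assumed sample input; &#8212; is a numeric character reference for the dash.
        NSString *xml = @"<text>"
                         "<l>prima quod ad Troiam pro caris gesserat Argis&#8212;</l>"
                         "<l>necdum etiam causae irarum saevique dolores</l>"
                         "</text>";
        NSXMLParser *parser =
            [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
        LineCollector *collector = [[LineCollector alloc] init];
        collector.lines = [NSMutableArray array];
        parser.delegate = collector;

        if ([parser parse]) {
            NSLog(@"%@", [collector.lines componentsJoinedByString:@"\n"]);
        } else {
            NSLog(@"Parse failed: %@", parser.parserError);
        }
    }
    return 0;
}
```

Collecting the lines in memory and writing the joined result once at the end also avoids reading and rewriting NSUserDefaults for every element, although that is a separate design choice from the append fix.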
Upvotes: 2