AmiiQo

Reputation: 353

Parsing XML files with special characters

I am trying to parse a list of persons and populate a UITableView with their names. Some of the names contain special characters (ä, ö, ü). When I parse the name "Gött", I end up with "ött" afterwards. Really strange, any ideas? Thanks a lot!

-(id) loadXMLByURL:(NSString *)urlString
{
    tweets          = [[NSMutableArray alloc] init];
    NSURL *url      = [NSURL URLWithString:urlString];
    NSData  *data   = [[NSData alloc] initWithContentsOfURL:url];
    parser          = [[NSXMLParser alloc] initWithData:data];
    parser.delegate = self;
    [parser parse];
    return self;
}

- (void) parser:(NSXMLParser *)parser didStartElement:(NSString *)elementname namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    if ([elementname isEqualToString:@"lehrer"]) 
    {
        currentTweet = [Tweet alloc];
    }
}

- (void) parser:(NSXMLParser *)parser didEndElement:(NSString *)elementname namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementname isEqualToString:@"name"]) 
    {
        currentTweet.content = currentNodeContent;
    }
    if ([elementname isEqualToString:@"vorname"]) 
    {
        currentTweet.vorname = currentNodeContent;
    }
    if ([elementname isEqualToString:@"created_at"]) 
    {
        currentTweet.dateCreated = currentNodeContent;
    }
    if ([elementname isEqualToString:@"lehrer"]) 
    {
        [tweets addObject:currentTweet];
        [currentTweet release];
        currentTweet = nil;
        [currentNodeContent release];
        currentNodeContent = nil;
    }
}

- (void) parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    currentNodeContent = (NSMutableString *) [string stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
}

- (void) dealloc
{
    [parser release];
    [super dealloc];
}

@end

Upvotes: 2

Views: 1005

Answers (2)

Rob

Reputation: 1253

As per Woody's answer, this is completely expected. You will need to concatenate the strings from the multiple - (void) parser:(NSXMLParser *)parser foundCharacters:(NSString *)string calls.

The correct way to do this is as follows:

- (void) parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
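    // foundCharacters: can fire several times per element, so accumulate the pieces.
    if (currentElementContent == nil)
        currentElementContent = [[NSMutableString alloc] initWithString:string];
    else
        [currentElementContent appendString:string]; // mutate in place; stringByAppendingString: would return an autoreleased, immutable NSString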
}

In any case, reset currentElementContent to nil at the very end of the didEndElement method, so that the next element starts with an empty buffer. For example:

- (void) parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    // Do what you want with the parser here

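    // Release the buffer (manual reference counting, as in the question; omit under ARC)
    // and set the element content variable to nil
    [currentElementContent release];
    currentElementContent = nil;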
}

You may need to replace currentElementContent with whatever variable your parser uses to hold the content found between the start and end tags.

Upvotes: 3

Woody

Reputation: 5130

This is normal behaviour: parser:foundCharacters: can be called multiple times for one string (and tends to be for accented characters). Your string isn't complete until the end of the element, so store the pieces and use the full string when you get to the end of the block. This is described in the documentation for parser:foundCharacters:.

Apple developer docs on NSXMLParser

The parser object may send the delegate several parser:foundCharacters: messages to report the characters of an element. Because string may be only part of the total character content for the current element, you should append it to the current accumulation of characters until the element changes.

Edit as per question:

The code in general is fine, but in the foundCharacters method, do this:

- (void) parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    if(nil == currentNodeContent)
        currentNodeContent = [[NSMutableString alloc] initWithString:string];
    else
        [currentNodeContent appendString:string];
}

Then, in both didStartElement and didEndElement, call a method that checks whether the string is non-nil, does whatever you were going to do with it in the first place, and then releases the string and sets it to nil.

A run of text ends both at the start of a new element (i.e. the text before an opening <) and at the end of the current one (the text before the closing </).
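
The accumulate-then-flush approach described above could look roughly like this. It is a minimal sketch, not the asker's actual class: NameCollector and flushCurrentNodeContent are hypothetical names, the XML is an inline sample, and the code is written for ARC, so the release calls discussed above do not appear.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate; class, property and helper names are illustrative only.
@interface NameCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *names;
@property (nonatomic, strong) NSMutableString *currentNodeContent;
@end

@implementation NameCollector

- (instancetype)init
{
    if ((self = [super init])) {
        _names = [NSMutableArray array];
    }
    return self;
}

// Returns the accumulated text (trimmed) and resets the buffer.
// Returns nil if nothing was accumulated.
- (NSString *)flushCurrentNodeContent
{
    NSString *text = [self.currentNodeContent stringByTrimmingCharactersInSet:
                      [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    self.currentNodeContent = nil;
    return text;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    // Text seen before an opening tag belongs to the parent element;
    // this sketch simply discards it.
    [self flushCurrentNodeContent];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // May be called several times per element, so append rather than assign.
    if (self.currentNodeContent == nil)
        self.currentNodeContent = [NSMutableString stringWithString:string];
    else
        [self.currentNodeContent appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    // The element's text is complete only now.
    NSString *text = [self flushCurrentNodeContent];
    if ([elementName isEqualToString:@"name"] && text.length > 0)
        [self.names addObject:text];
}

@end

int main(void)
{
    @autoreleasepool {
        NSString *xml = @"<lehrer><name>Gött</name></lehrer>";
        NSXMLParser *parser = [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
        NameCollector *collector = [[NameCollector alloc] init];
        parser.delegate = collector;
        if ([parser parse]) {
            for (NSString *name in collector.names)
                NSLog(@"%@", name);
        }
    }
    return 0;
}
```

Trimming happens once, on the complete string, rather than on each fragment. That way whitespace inside a name survives even if the parser happens to split the text at that point.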

Upvotes: 3
