Reputation: 183
I've had a problem that's been bugging me for a few days now.
I'm parsing an RSS feed with NSXMLParser and feeding the results into a UITableView. Unfortunately, the feed returns some HTML which I parse out with the following method:
- (NSString *)flattenHTML:(NSString *)html {
    NSScanner *theScanner;
    NSString *text = nil;
    theScanner = [NSScanner scannerWithString:html];
    while ([theScanner isAtEnd] == NO) {
        [theScanner scanUpToString:@"<" intoString:NULL];
        [theScanner scanUpToString:@">" intoString:&text];
        html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"%@>", text] withString:@""];
    }
    html = [html stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    return html;
}
I currently call this method during the NSXMLParser delegate method:
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName{
This works beautifully. HOWEVER, it takes a minute or more to parse and flatten the HTML into text and fill the cells. During that interminable minute my UITableView is entirely empty, with just a lone spinner spinning. That's not good. This is the last "bug" to squash before I ship this otherwise wonderfully working app.
It works pretty quickly on the iOS Simulator, which isn't surprising.
Thanks in advance for any advice.
Upvotes: 1
Views: 952
Reputation:
Your algorithm is not very good. For each tag, you try to remove it even if it has already been stripped. Each iteration of the loop also makes a copy of the whole HTML string, often without stripping anything out. If you are not using ARC, those copies will also persist until the current autorelease pool is drained. You are not only wasting memory, you are also doing a lot of unnecessary work.
Testing your method (with the Cocoa Wikipedia article) takes 3.5 seconds.
Here is an improved version of this code:
- (NSString *)flattenHTML:(NSString *)html {
    NSScanner *theScanner = [NSScanner scannerWithString:html];
    theScanner.charactersToBeSkipped = nil;
    NSMutableString *result = [NSMutableString stringWithCapacity:[html length]];
    while (![theScanner isAtEnd]) {
        NSString *part = nil;
        if ([theScanner scanUpToString:@"<" intoString:&part] && part) {
            [result appendString:part];
        }
        [theScanner scanUpToString:@">" intoString:NULL];
        [theScanner scanString:@">" intoString:NULL];
    }
    return [result stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
}
This will tell the scanner to get every character up to the first < and append them to the result string if there are any. Then it will skip up to the next > and then also skip the > to strip out the tag. This gets repeated until the end of the text. Every character is only touched once, making this an O(n) algorithm.
This takes only 6.5 ms for the same data. That is about 530 times faster.
Btw, those measurements were made on a Mac. The exact values will of course be different on an iPhone.
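To repeat a comparison like this on an actual device, a small wall-clock timing helper is usually enough. The following is only a sketch: the name MeasureMilliseconds is made up for illustration, and it is not necessarily how the numbers above were obtained.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of any SDK): runs `work` once and
// returns the elapsed wall-clock time in milliseconds.
static double MeasureMilliseconds(void (^work)(void)) {
    CFAbsoluteTime start = CFAbsoluteTimeGetCurrent();
    work();
    return (CFAbsoluteTimeGetCurrent() - start) * 1000.0;
}
```

It could be called as MeasureMilliseconds(^{ [self flattenHTML:html]; }) with the same input for the old and the new version, ideally several times, since a single run can be noisy.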
Upvotes: 3
Reputation: 10828
I'm not sure what exactly the problem is. Is it that the flattenHTML method takes a long time to finish, or that it blocks your app while it's running?
If it's the latter, and assuming you are doing everything right in flattenHTML and it really does take that long to finish, the only thing you can do is make sure you are not blocking your main thread while doing this. You can use GCD or NSOperation to achieve this. Beyond that, there is nothing else you can do except let the user know you are parsing the data and let them decide whether to wait or to cancel the operation and do something else.
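As a rough illustration of the GCD route, the sketch below does the flattening on a global background queue and hands the result back on the main queue. FlattenHTML is a plain-function stand-in for the question's flattenHTML: method, and FlattenHTMLInBackground is a hypothetical name; neither comes from the original code.

```objc
#import <Foundation/Foundation.h>

// Stand-in for the question's flattenHTML: method (an assumption for
// this sketch): copies everything outside of <...> into the result.
static NSString *FlattenHTML(NSString *html) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    scanner.charactersToBeSkipped = nil;
    NSMutableString *result = [NSMutableString stringWithCapacity:html.length];
    while (![scanner isAtEnd]) {
        NSString *part = nil;
        if ([scanner scanUpToString:@"<" intoString:&part] && part) {
            [result appendString:part];
        }
        [scanner scanUpToString:@">" intoString:NULL]; // skip the tag body
        [scanner scanString:@">" intoString:NULL];     // skip the closing >
    }
    return [result stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}

// Hypothetical helper: flattens `html` off the main thread, then calls
// `completion` on the main queue, where UI updates are allowed.
static void FlattenHTMLInBackground(NSString *html,
                                    void (^completion)(NSString *text)) {
    dispatch_queue_t queue =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_async(queue, ^{
        NSString *text = FlattenHTML(html);
        dispatch_async(dispatch_get_main_queue(), ^{
            completion(text);
        });
    });
}
```

In a view controller, the completion block would be the place to store the text in the data source and call reloadData on the table view. Running the whole NSXMLParser pass inside the background block is another option, as long as anything that touches the UI is still dispatched back to the main queue.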
Upvotes: 0
Reputation: 2783
I ran into a similar problem and couldn't make it any faster. Instead, I showed a progress bar to indicate how far the parsing had progressed.
The code below is part of that.
// First, count the lines of the XML file
NSError *error = nil;
NSString *xmlFileString = [NSString stringWithContentsOfURL:url
                                                   encoding:NSUTF8StringEncoding
                                                      error:&error];
_totalLines = [xmlFileString componentsSeparatedByString:@"\n"].count;
// do other things...

// Delegate method called when the parser finds a new element
- (void)parser:(NSXMLParser *)parser
didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict
{
    // do something ...
    // Go back to the main thread to update the UI
    NSOperationQueue *mainQueue = [NSOperationQueue mainQueue];
    [mainQueue addOperationWithBlock:^{
        // This is the important part: get the current line number and update the progress bar.
        _progressView.progress = (CGFloat)[parser lineNumber] / (CGFloat)_totalLines;
    }];
}
I have a sample project on GitHub that you can download and run. I hope my code is of some help to you.
https://github.com/weed/p120727_XMLParseProgress
Upvotes: 0