stackOverFlew
stackOverFlew

Reputation: 1499

How to check NSString isEqualToString for a lot of values

I am looking for the most efficient way to match an NSString against about 200,000 other strings in the minimum amount of time. Basically I need to check, every once in a while, whether or not a word that is inputted is part of the English language.

What are the ways I could do this? I've heard about hashtables-- is that the best way?


Here is the code I settled on:

EDIT: benchmarks for dictionary initialization to memory:

NSDate *start = [NSDate date];

NSLog(@"initWords");

//temporary 
NSString *_filePath = [[NSBundle mainBundle] pathForResource:kFILE_NAME_FOR_DICTIONARY ofType:@"txt"];

NSLog(@"%@",_filePath);

NSString *_fileContents = [[NSString alloc] init];
NSData *_binary = [NSData dataWithContentsOfFile:_filePath];  

if (_binary) {  
    _fileContents = [[NSString alloc] initWithData:_binary encoding:NSUTF8StringEncoding];
}  else {

    NSLog(@"file parse error: did you forget to add the file? Please add file to:\n\n\n\n%@\n\n\n\n",[[NSBundle mainBundle] resourcePath]);
}

NSArray *_wordList = [_fileContents componentsSeparatedByString:kNEW_LINE_CHAR];

englishDictionary = [[NSMutableSet alloc] init];
[englishDictionary addObjectsFromArray:_wordList];

NSLog(@"Word count:\t%d",englishDictionary.count);
NSLog(@"Time to init dictionary:\t%f",[start timeIntervalSinceNow]*-1.);

iphone 5: 1.089725 (seconds)

ipad 1: 3.082753

iphone 4: 3.582853

Benchmark (time to test word if it is in the dictionary or not):

-(BOOL)checkWord:(NSString *)word{

    NSDate *start = [NSDate date];

    BOOL yesNoMaybeSo = [englishDictionary containsObject:word];

    NSLog(@"Time to check word:\t%f",[start timeIntervalSinceNow]*-1.);
    return yesNoMaybeSo;

}

iphone 5: 0.000021 (seconds)

ipad 1: 0.000037

iphone 4: 0.000043

Upvotes: 0

Views: 345

Answers (3)

Kevin
Kevin

Reputation: 56059

If you don't mind rolling your own, you can build a trie (a.k.a prefix tree) from the dictionary and use it to look up the words. I'd bet a reasonably good implementation using c strings and structs would be faster and more memory-efficient than an NSDictionary.

Upvotes: 0

Nickolay Olshevsky
Nickolay Olshevsky

Reputation: 14160

With such requirements standard methods/classes will not fit your needs. You should learn/read some books on algorithms to implement this correctly. Looks like you should use hash tables, but, again, with 200K values NSDictionary (which is hash table) probably will not work fast enough.

Upvotes: 1

Gabriele Petronella
Gabriele Petronella

Reputation: 108101

The most efficient way is probably to use a NSSet to store all the words you want to compare against your string.

Then you simply check whether your word belongs to the set like follows

BOOL englishWord = [theEnglishSet containsObject:yourString];

This will take constant time to perform.

Upvotes: 2

Related Questions