Reputation: 10971
I have an array of strings, from which I would like to extract only those with unique character sets. (For example, "asdf" and "fdsa" would be considered redundant). This is the method I am currently using:
NSMutableArray *uniqueCharSets = [[NSMutableArray alloc] init];
NSMutableArray *uniqueStrings = [[NSMutableArray alloc] init];
for (NSString *_string in unique) {
NSCharacterSet *_charSet = [NSCharacterSet characterSetWithCharactersInString:_string];
if (![uniqueCharSets containsObject:_charSet]) {
[uniqueStrings addobject:_string];
[uniqueCharSets addObject:_charSet];
}
}
This seems to work, but it's very slow and resource-intensive. Can anyone think of a better way to do this?
Upvotes: 6
Views: 1853
Reputation: 96937
NSDictionary
, map each string's lexicographically-sorted equivalent to an NSArray
of input strings: (e.g. adfs
=> [afsd, asdf, ...]
)Upvotes: 1
Reputation: 10865
The only thing that come in my mind is not to use containsObject
: since NSMutableArray
is not ordered (in general), we can assume that containsObject
simply iterates the array starting from the beginning until he finds the object. This means O(n)
(n
comparisons in the worst case).
A better solution may consists in keeping the array ordered and use a custom search method using a dichotomic approach. This way you'll have a O(log n)
complexity.
Of course, you must take care of keeping your array ordered (much more efficient than add and reorder), so you should use insertObject:atIndex:
method to insert the element properly.
Upvotes: 0
Reputation: 28688
I just put together a quick example of how I would approach this, but it turns out that it is more, odd, than you first expect. For one, NSCharacterSet
doesn't implement equality to check contents. It only uses the pointer value. Based on this your example will NOT work properly.
My approach is to use an NSSet to deal with the hashing of these for us.
@interface StringWrapper : NSObject
@property (nonatomic, copy) NSString *string;
@property (nonatomic, copy) NSData *charSetBitmap;
- (id)initWithString:(NSString*)aString;
@end
@implementation StringWrapper
@synthesize string, charSetBitmap;
- (id)initWithString:(NSString*)aString;
{
if ((self = [super init]))
{
self.string = aString;
}
return self;
}
- (void)setString:(NSString *)aString;
{
string = [aString copy];
self.charSetBitmap = [[NSCharacterSet characterSetWithCharactersInString:aString] bitmapRepresentation];
}
- (BOOL)isEqual:(id)object;
{
return [self.charSetBitmap isEqual:[object charSetBitmap]];
}
- (NSUInteger)hash;
{
return [self.charSetBitmap hash];
}
@end
int main (int argc, const char * argv[])
{
@autoreleasepool {
NSMutableSet *stringWrappers = [[NSMutableSet alloc] init];
NSArray *strings = [NSArray arrayWithObjects:@"abc",@"aaabcccc",@"awea",@"awer",@"abcde", @"ehra", @"QWEQ", @"werawe", nil];
for (NSString *str in strings)
[stringWrappers addObject:[[StringWrapper alloc] initWithString:str]];
NSArray *uniqueStrings = [stringWrappers valueForKey:@"string"];
NSLog(@"%@", uniqueStrings);
}
return 0;
}
The code is pretty straightforward. We create a container object to cache the results of the character set's bitmap representation. We use the bitmap representation because NSData
implements isEqual:
appropriately.
Upvotes: 0