Jack120

Reputation: 223

Regex stringByReplacingMatchesInString

I'm trying to remove any non-alphanumeric character within a string. I tried the following code snippet, but it is not replacing the appropriate character.

NSString *theString = @"\"day's\"";

NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"^\\B\\W^\\B" options:NSRegularExpressionCaseInsensitive error:&error];

NSString *newString = [regex stringByReplacingMatchesInString:theString options:0 range:NSMakeRange(0, [theString length]) withTemplate:@""];

NSLog(@"the resulting string is %@", newString);

Upvotes: 2

Views: 3826

Answers (1)

Monolo

Reputation: 18253

Since there's a need to preserve the enclosing quotation marks in the string, the regex necessarily becomes a bit complex.

Here is one which does it:

(?:(?<=^")(\W+))|(?:(?!^")(\W+)(?=.))|(?:(\W+)(?="$))

It uses lookbehind and lookahead to check for the quotation marks without including them in the match, and hence they will not be deleted when the match is replaced with the empty string.

The three alternatives handle non-word characters right after the initial quotation mark, in the middle of the string, and right before the last quotation mark, respectively.

It is a bit pedestrian and there has to be a simpler way to do it, but I haven't been able to find it. Others are welcome to chime in!

NSString *theString = @"\"day's\"";

NSString *pattern   = @"(?:(?<=^\")(\\W+))|(?:(?!^\")(\\W+)(?=.))|(?:(\\W+)(?=\"$))";


NSError *error = nil;   // nil is the idiomatic null value for Objective-C object pointers
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern: pattern 
                                                                       options: 0        // No need to specify case insensitive, \W makes it irrelevant
                                                                         error: &error];

NSString *newString = [regex stringByReplacingMatchesInString: theString
                                                      options: 0
                                                        range: NSMakeRange(0, [theString length]) 
                                                 withTemplate: @""];
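
For testing the pattern against more inputs, the snippet above can be wrapped in a function. This is only a minimal sketch: the helper name StripInnerNonWordCharacters and the names of the three pattern parts are made up for illustration, and the pattern itself is the one above, split at its two | characters and joined again unchanged.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (the name is an assumption, not part of any API):
// removes non-word characters from `input` while keeping an enclosing
// pair of quotation marks. Returns nil if the pattern fails to compile.
static NSString *StripInnerNonWordCharacters(NSString *input) {
    // The three alternatives of the pattern, unchanged.
    NSString *afterOpeningQuote  = @"(?:(?<=^\")(\\W+))";
    NSString *inTheMiddle        = @"(?:(?!^\")(\\W+)(?=.))";
    NSString *beforeClosingQuote = @"(?:(\\W+)(?=\"$))";
    NSString *pattern = [@[afterOpeningQuote, inTheMiddle, beforeClosingQuote]
                            componentsJoinedByString:@"|"];

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        // Compilation failed; `error` describes why.
        NSLog(@"Invalid pattern: %@", error);
        return nil;
    }

    // Every whole match is replaced by the empty string.
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@""];
}
```

Calling StripInnerNonWordCharacters(@"\"day's\"") should give back "days" with both quotation marks still in place.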

The (?:) construct creates a non-capturing group, meaning that you can keep the lookbehind (or lookahead) and the "real" capture group together without creating another capture group around the whole alternative. Strictly speaking, the grouping is not what protects the quotation marks: stringByReplacingMatchesInString replaces the entire match, and lookbehind and lookahead are zero-width, so the quotation marks are never part of the text that gets replaced.
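
To check which characters a replacement would actually remove, it can help to list the whole matches a pattern produces. The following is a rough sketch under that assumption; WholeMatches is a hypothetical helper name, while matchesInString:options:range: and the range property of NSTextCheckingResult are the standard Foundation calls.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text of every whole match of `pattern`
// in `input`. This is the text that stringByReplacingMatchesInString
// would replace. Returns an empty array if the pattern does not compile.
static NSArray<NSString *> *WholeMatches(NSString *pattern, NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *matches = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *results =
        [regex matchesInString:input
                       options:0
                         range:NSMakeRange(0, input.length)];
    for (NSTextCheckingResult *result in results) {
        // result.range covers the whole match; text that was only
        // inspected by a lookbehind or lookahead lies outside of it.
        [matches addObject:[input substringWithRange:result.range]];
    }
    return matches;
}
```

For the pattern used in this answer and the input "day's" (including the quotation marks), the only whole match should be the apostrophe, which is why the quotation marks survive the substitution.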

Upvotes: 4
