NSRegularExpression anomaly with string containing accented "é" character

Question

I'm using the stringByReplacingMatchesInString method of NSRegularExpression to separate input strings into parts so that I can rearrange them. This was working well until I tested it against a string containing an accented "é".

Here's an Xcode playground demonstrating the problem. In this cut-down example (it's not very "real world", but it does demonstrate the problem), I'm matching everything, then creating a new string using a template that simply repeats the match: "$1 - $1".

import Cocoa

var err: NSError?
var regex = NSRegularExpression(pattern: "^(.*?)$", options: nil, error: &err)

let test = "homér simpson"
let r = NSMakeRange(0, count(test))

var str = regex!.stringByReplacingMatchesInString(test, options: nil, range: r, withTemplate: "$1 - $1")

The string "str" ends up being "homér simpso - homér simpson". As you can see, the first instance of $1 is truncated by 1 character, and I've found that this is because of the accented "é". If you edit it to use a plain "e", it's fine.

But here's the weird thing. If you edit it again to put the accented "é" back in the string, it behaves like it should and doesn't truncate.

I'm inclined to suspect the range passed to the method, but I thought that count() was smart enough to handle the presence of Unicode characters?
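One way to test that suspicion is to compare the two lengths directly. The sketch below is written in current Swift syntax rather than the Swift 1.2 syntax of the playground above, and it assumes the troublesome "é" is stored in decomposed form ("e" followed by the combining accent U+0301). That is a guess about the original input, not something the playground confirms, but it would explain why retyping the character (which usually produces the single precomposed code point) makes the problem disappear. The variable names are illustrative only.

```swift
import Foundation

// Assumption: the "é" is decomposed, i.e. "e" + U+0301 (combining acute accent).
let test = "home\u{301}r simpson"

// Counting Characters treats "e" + accent as one grapheme cluster.
let characterCount = test.count        // 13

// NSRange is measured in UTF-16 code units, where the accent is a separate unit.
let utf16Count = test.utf16.count      // 14

let regex = try! NSRegularExpression(pattern: "^(.*?)$", options: [])

// A range built from the Character count stops one UTF-16 unit short of the end.
let shortRange = NSRange(location: 0, length: characterCount)
let truncated = regex.stringByReplacingMatches(
    in: test, options: [], range: shortRange, withTemplate: "$1 - $1")

// A range built from the UTF-16 length covers the whole string.
let fullRange = NSRange(location: 0, length: utf16Count)
let fixed = regex.stringByReplacingMatches(
    in: test, options: [], range: fullRange, withTemplate: "$1 - $1")

print(characterCount, utf16Count)
print(truncated)
print(fixed)
```

If the assumption holds, the short range leaves the final "n" outside the match, so it is never captured by $1 and is simply carried over after the replacement, which matches the truncated result described above. In the playground's own Swift 1.2 syntax, (test as NSString).length should give the UTF-16 length that NSMakeRange expects.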
