Reputation: 111
I have a question about extended grapheme clusters. For example, consider the following code:
let message = "c\u{0327}a va bien" // => "ça va bien"
How does Swift know that the two scalars need to be combined into one character (i.e. ç) rather than treated as a small letter c followed by a separate "COMBINING CEDILLA"?
Upvotes: 0
Views: 185
Reputation: 93181
Use the unicodeScalars view on the string:
import Foundation // provides the two normalization properties used below
let message1 = "c\u{0327}".decomposedStringWithCanonicalMapping
for scalar in message1.unicodeScalars {
print(scalar) // prints c, then the combining cedilla, as two separate scalars
}
let message2 = "c\u{0327}".precomposedStringWithCanonicalMapping
for scalar in message2.unicodeScalars {
print(scalar) // prints ç (Latin Small Letter C with Cedilla) as a single scalar
}
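As a minimal sketch of why this matters for the original question: a Swift Character is an extended grapheme cluster, so the same visible letter can be backed by one scalar or by several. The variable names below are only illustrative and are not from the question.

```swift
// Two spellings of the same visible letter.
let decomposed = "c\u{0327}"   // LATIN SMALL LETTER C + COMBINING CEDILLA
let precomposed = "\u{00E7}"   // LATIN SMALL LETTER C WITH CEDILLA

print(decomposed.count)                  // 1: one Character (grapheme cluster)
print(decomposed.unicodeScalars.count)   // 2: two Unicode scalars underneath
print(precomposed.unicodeScalars.count)  // 1: a single scalar
print(decomposed == precomposed)         // true: String equality uses canonical equivalence
```

So the string is not rewritten when it is created. The scalars are stored as written, and the grouping into a single Character happens when you look at the string through its default Character view.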
Note that not all composite characters have a precomposed form, as noted by Apple's Technical Q&A:
Important: Do not convert to precomposed Unicode in an attempt to simplify your text processing. Precomposed Unicode can still contain composite characters. For example, there is no precomposed equivalent of U+0065 U+030A (LATIN SMALL LETTER E followed by COMBINING RING ABOVE).
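The case from the quote can be checked with a short sketch. It assumes Foundation is available for precomposedStringWithCanonicalMapping, and the names eRing and composed are made up for illustration:

```swift
import Foundation

// LATIN SMALL LETTER E followed by COMBINING RING ABOVE
let eRing = "e\u{030A}"
let composed = eRing.precomposedStringWithCanonicalMapping

// No single precomposed scalar exists for this pair, so normalization leaves both scalars.
print(composed.unicodeScalars.count)  // 2
// Swift still groups them into one Character.
print(composed.count)                 // 1
```

This is why counting or comparing by Character is usually safer than normalizing first and then working with scalars.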
Upvotes: 1