JohnnL

Reputation: 111

Extended Grapheme Clusters stop combining

I have a question about extended grapheme clusters. For example, consider the following code:

let message = "c\u{0327}a va bien" // => "ça va bien" 

How does Swift know it needs to be combined (i.e. ç) rather than treating it as a small letter c AND a "COMBINING CEDILLA"?

Upvotes: 0

Views: 185

Answers (1)

Code Different

Reputation: 93181

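Swift's String is built on extended grapheme clusters, so a base letter followed by a combining mark is always treated as a single Character. To see the individual scalars, use the unicodeScalars view on the string: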

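import Foundation // provides the ...WithCanonicalMapping properties

// Decomposed form (NFD): base letter and combining mark stay separate scalars
let message1 = "c\u{0327}".decomposedStringWithCanonicalMapping
for scalar in message1.unicodeScalars {
    print(scalar) // prints "c", then U+0327 COMBINING CEDILLA on its own line
}

// Precomposed form (NFC): the pair is merged into one scalar where one exists
let message2 = "c\u{0327}".precomposedStringWithCanonicalMapping
for scalar in message2.unicodeScalars {
    print(scalar) // prints "ç" (U+00E7 LATIN SMALL LETTER C WITH CEDILLA)
}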
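
To make the difference between characters and scalars concrete, here is a minimal sketch that compares the two spellings of ç. The variable names are my own, chosen for illustration. It relies on two things: String.count counts Characters (extended grapheme clusters), and String's == compares by canonical equivalence.

```swift
// Two spellings of the same user-perceived character
let composed = "\u{00E7}"      // one scalar: LATIN SMALL LETTER C WITH CEDILLA
let combining = "c\u{0327}"    // two scalars: "c" + COMBINING CEDILLA

// Character view: each string is a single extended grapheme cluster
let composedCharacterCount = composed.count
let combiningCharacterCount = combining.count

// Scalar view: the underlying representations differ
let composedScalarCount = composed.unicodeScalars.count
let combiningScalarCount = combining.unicodeScalars.count

// String equality uses canonical equivalence, not scalar-by-scalar comparison
let areEqual = (composed == combining)

print(composedCharacterCount, combiningCharacterCount)
print(composedScalarCount, combiningScalarCount)
print(areEqual)
```

So the two strings differ only in the unicodeScalars view; at the Character level they are the same single character.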

Note that not all composite characters have a precomposed form, as noted by Apple's Technical Q&A:

Important: Do not convert to precomposed Unicode in an attempt to simplify your text processing. Precomposed Unicode can still contain composite characters. For example, there is no precomposed equivalent of U+0065 U+030A (LATIN SMALL LETTER E followed by COMBINING RING ABOVE)
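
The example from that quote can be checked directly. This is only a sketch, with illustrative variable names of my own: it normalizes U+0065 U+030A to precomposed form and counts the scalars that remain.

```swift
import Foundation // for precomposedStringWithCanonicalMapping

// "e" followed by COMBINING RING ABOVE; no single precomposed scalar exists for this pair
let eWithRing = "e\u{030A}"
let normalized = eWithRing.precomposedStringWithCanonicalMapping

// Precomposing cannot merge the pair, so both scalars survive
let scalarCountAfterPrecomposing = normalized.unicodeScalars.count

// Swift still treats the pair as one Character (extended grapheme cluster)
let characterCount = normalized.count

print(scalarCountAfterPrecomposing, characterCount)
```

This is why it is safer to work with Character values than to rely on precomposed scalars.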

Upvotes: 1
