What's the difference between CharacterView and UnicodeScalarView of the String type?

Question

The following two snippets do practically the same thing:

for character in "Dog!🐶".characters {
    print(character)
}

for character in "Dog!🐶".unicodeScalars {
    print(character)
}

However, when I looked at what happens behind the scenes, I found a difference: the characters property is of type CharacterView, while unicodeScalars is of type UnicodeScalarView.

Question:

What's the difference between them?

Which property is preferred in which situation? (An example would be nice.)

Many Thanks

Alexander · Accepted Answer

This comes down to the difference between a Character and a UnicodeScalar.

Unicode Scalars

Behind the scenes, Swift’s native String type is built from Unicode scalar values. A Unicode scalar is a unique 21-bit number for a character or modifier, such as U+0061 for LATIN SMALL LETTER A ("a"), or U+1F425 for FRONT-FACING BABY CHICK ("🐥").

...

Extended Grapheme Clusters

Every instance of Swift’s Character type represents a single extended grapheme cluster. An extended grapheme cluster is a sequence of one or more Unicode scalars that (when combined) produce a single human-readable character.

Here’s an example. The letter é can be represented as the single Unicode scalar é (LATIN SMALL LETTER E WITH ACUTE, or U+00E9). However, the same letter can also be represented as a pair of scalars—a standard letter e (LATIN SMALL LETTER E, or U+0065), followed by the COMBINING ACUTE ACCENT scalar (U+0301). The COMBINING ACUTE ACCENT scalar is graphically applied to the scalar that precedes it, turning an e into an é when it is rendered by a Unicode-aware text-rendering system.

From the Strings and Characters section of the Swift Programming Language Guide.
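
The two spellings of é described in the quote can be checked directly. This is a minimal sketch that assumes Swift 4 or later, where a String is itself a collection of Character values (in Swift 3 the same counts come from the characters property); the constant names are only illustrative.

```swift
// "é" spelled two ways: the precomposed scalar U+00E9,
// and "e" (U+0065) followed by COMBINING ACUTE ACCENT (U+0301).
let precomposed = "\u{E9}"
let decomposed = "e\u{301}"

// Character level: each string is one extended grapheme cluster.
print(precomposed.count, decomposed.count)

// Scalar level: the decomposed form is made of two scalars.
print(precomposed.unicodeScalars.count, decomposed.unicodeScalars.count)

// List the individual scalars of the decomposed form in hex.
for scalar in decomposed.unicodeScalars {
    print("U+" + String(scalar.value, radix: 16, uppercase: true))
}

// String comparison works on canonical equivalence, so the two
// spellings compare equal even though their scalars differ.
print(precomposed == decomposed)
```

Both strings contain one Character, but only the scalar view reveals that the second one is stored as two code points.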

In most cases that I can think of, you'll want to deal with Character instances, as they are the smallest unit of human language. I can't imagine a situation where you'd want to operate on a modifier without considering the full extended grapheme cluster.
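
As an illustration of what goes wrong when a modifier is handled apart from its cluster, here is a small sketch that reverses a string both ways. It assumes Swift 4 or later, and the variable names are made up for the example.

```swift
// "café" with a decomposed é: "e" followed by U+0301.
let word = "cafe\u{301}"

// Reversing by Character keeps "e" and its accent together.
let byCharacter = String(word.reversed())

// Reversing by scalar moves the combining accent in front of the "e",
// so it no longer has a base letter to attach to.
var byScalar = ""
for scalar in word.unicodeScalars.reversed() {
    byScalar.unicodeScalars.append(scalar)
}

print(byCharacter)
print(byScalar)
print(byCharacter == byScalar)
```

The Character-based reversal still contains an é, while the scalar-based one starts with a stray accent, so the two results are not equal.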