SLN

Reputation: 5082

What's the difference between CharacterView and UnicodeScalarView of the String type?

The following two snippets do practically the same thing:

for character in "Dog!🐶".characters {
    print(character)
}

for character in "Dog!🐶".unicodeScalars {
    print(character)
}

However, when I looked into what happens behind the scenes, I found a difference: the characters property has the type CharacterView, while unicodeScalars has the type UnicodeScalarView.

Question:

What's the difference between them?

Which property is preferred in which situation? (An example would be nice.)

Many Thanks

Upvotes: 1

Views: 720

Answers (1)

Alexander

Reputation: 63271

This comes down to the difference between a Character and UnicodeScalar.

Unicode Scalars

Behind the scenes, Swift’s native String type is built from Unicode scalar values. A Unicode scalar is a unique 21-bit number for a character or modifier, such as U+0061 for LATIN SMALL LETTER A ("a"), or U+1F425 for FRONT-FACING BABY CHICK ("🐥").

...

Extended Grapheme Clusters

Every instance of Swift’s Character type represents a single extended grapheme cluster. An extended grapheme cluster is a sequence of one or more Unicode scalars that (when combined) produce a single human-readable character.

Here’s an example. The letter é can be represented as the single Unicode scalar é (LATIN SMALL LETTER E WITH ACUTE, or U+00E9). However, the same letter can also be represented as a pair of scalars: a standard letter e (LATIN SMALL LETTER E, or U+0065), followed by the COMBINING ACUTE ACCENT scalar (U+0301). The COMBINING ACUTE ACCENT scalar is graphically applied to the scalar that precedes it, turning an e into an é when it is rendered by a Unicode-aware text-rendering system.

From the Strings and Characters section of the Swift Programming Language Guide.
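To make the quoted example concrete, here is a minimal sketch comparing the two spellings of é. It assumes Swift 4 or later, where String is itself a collection of Character values (in older versions you would go through the characters property instead). The variable names are just for illustration.

```swift
// Two spellings of "é": one precomposed scalar vs. a base letter plus a combining mark.
let precomposed = "\u{E9}"          // LATIN SMALL LETTER E WITH ACUTE
let decomposed = "\u{65}\u{301}"    // LATIN SMALL LETTER E + COMBINING ACUTE ACCENT

// Counting Characters (extended grapheme clusters): both strings hold one.
let precomposedCharacterCount = precomposed.count
let decomposedCharacterCount = decomposed.count

// Counting Unicode scalars: the decomposed form holds two.
let precomposedScalarCount = precomposed.unicodeScalars.count
let decomposedScalarCount = decomposed.unicodeScalars.count

// String comparison uses canonical equivalence, so the two spellings compare equal.
let spellingsAreEqual = (precomposed == decomposed)

print(precomposedCharacterCount, decomposedCharacterCount)  // 1 1
print(precomposedScalarCount, decomposedScalarCount)        // 1 2
print(spellingsAreEqual)                                    // true
```

So the Character-based view sees one é in both cases, while the scalar view exposes how the string happens to be encoded.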

In most cases I can think of, you'll want to deal with Character instances, as they are the smallest unit of human language. I can't imagine a situation where you'd want to operate on a modifier without considering the full extended grapheme cluster.
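As a rough illustration of why working with Character is usually the safer choice, the sketch below reverses a string both ways. It assumes Swift 4 or later; the sample word and variable names are hypothetical.

```swift
// "café" spelled with a decomposed é: e followed by COMBINING ACUTE ACCENT.
let word = "cafe\u{301}"

// Reversing by Character keeps each grapheme cluster intact,
// so the accent stays attached to its "e".
let reversedByCharacter = String(word.reversed())

// Reversing by Unicode scalar separates the accent from its base letter:
// the combining mark ends up at the very front of the result.
let reversedScalars = String.UnicodeScalarView(word.unicodeScalars.reversed())
let reversedByScalar = String(reversedScalars)

// Inspect the scalar values of each result.
let characterResult = reversedByCharacter.unicodeScalars.map { $0.value }
let scalarResult = reversedByScalar.unicodeScalars.map { $0.value }

print(characterResult)  // [101, 769, 102, 97, 99]  (e, accent, f, a, c)
print(scalarResult)     // [769, 101, 102, 97, 99]  (accent, e, f, a, c)
```

In the scalar-based result the accent no longer follows the letter it belonged to, which is the kind of mistake the Character view prevents.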

Upvotes: 2
