How to remove a flag from the end of a string?

Question

I found strange behavior in String.characters.count for strings that contain emoji flags:

import UIKit

var flag = "🇨🇦🇨🇦🇨🇦🇨🇦🇨🇦🇨🇦"
print(flag.characters.count)
print(flag.unicodeScalars.count)
print(flag.utf16.count)
print(flag.utf8.count)
flag = "🇨🇦0🇨🇦0🇨🇦0"
print(flag.characters.count)
print(flag.unicodeScalars.count)
print(flag.utf16.count)
print(flag.utf8.count)

I want to limit the length of the text while it is being typed and edited in a UITextView. My current code is this:

var lastRange: NSRange? = nil
var lastText: String? = nil

func textView(textView: UITextView, shouldChangeTextInRange range: NSRange, replacementText string: String) -> Bool {
    if string == "\n" {
        // Execute same code
        return false
    } 
    var text = string.uppercaseString
    if lastText != text || lastRange != nil && (lastRange!.location != range.location || lastRange!.length != range.length) {
        lastRange = range
        lastText = text

        var text = (self.textView.text ?? "" as NSString).stringByReplacingCharactersInRange(range, withString: string)

        // Delete chars if length more kMaxLengthText 
        while text.utf16.count >= kMaxLengthText {
            text.removeAtIndex(text.endIndex.advancedBy(-1))
        }
        // Set position after insert text
        self.textView.selectedRange = NSRange(location: range.location + lastText!.utf16.count, length: 0)
    }
    return false
}

Martin R · Accepted Answer

Update for Swift 4 (Xcode 9)

As of Swift 4 (tested with Xcode 9 beta), flags (i.e. pairs of regional indicators) are treated as single grapheme clusters, as mandated by the Unicode 9 standard. So counting flags and removing the last character (whether it is a flag or not) is now as simple as:

var flags = "🇩🇪🇩🇪🇩🇪🇨🇦🇨🇦🇨🇦"
print(flags.count) // 6

flags.removeLast()
print(flags.count) // 5
print(flags) // 🇩🇪🇩🇪🇩🇪🇨🇦🇨🇦
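
Applied to the length limit from the question, this could look roughly like the sketch below. It assumes Swift 4 or later; maxLength is a hypothetical stand-in for kMaxLengthText, and the limit is counted in visible characters rather than UTF-16 code units.

```swift
// Hypothetical limit, standing in for kMaxLengthText from the question.
let maxLength = 4

var text = "🇩🇪🇩🇪🇩🇪🇨🇦🇨🇦🇨🇦"

// String.count counts grapheme clusters, so each flag counts once (Swift 4+).
// removeLast() removes one whole grapheme cluster, never half a flag.
while text.count > maxLength {
    text.removeLast()
}
print(text.count) // 4
print(text)       // 🇩🇪🇩🇪🇩🇪🇨🇦

// The same truncation without a loop: prefix(_:) also works on grapheme clusters.
let truncated = String("🇩🇪🇩🇪🇩🇪🇨🇦🇨🇦🇨🇦".prefix(maxLength))
print(truncated == text) // true
```

If the limit is meant in UTF-16 code units, as in the question, the loop condition could compare text.utf16.count instead, while still removing whole characters with removeLast().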

(Old answer for Swift 3 and earlier:)

There is no bug. A sequence of "Regional Indicator" characters is a single "extended grapheme cluster", that is why

var flag = "🇨🇦🇨🇦🇨🇦🇨🇦🇨🇦🇨🇦"
print(flag.characters.count)

prints 1 (compare Swift countElements() return incorrect value when count flag emoji).

On the other hand, the above string consists of 12 Unicode scalars (🇨🇦 is 🇨 + 🇦), and each of them needs two UTF-16 code units.
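
These numbers can be checked directly. The following is only a small illustration, written in Swift 3 or later syntax; the scalar, UTF-16 and UTF-8 views give the same counts in every Swift version, unlike the character count.

```swift
let flag = "🇨🇦🇨🇦🇨🇦🇨🇦🇨🇦🇨🇦"

// Six flags, each made of two regional indicator scalars.
print(flag.unicodeScalars.count) // 12
// Each regional indicator lies outside the BMP and needs a surrogate pair.
print(flag.utf16.count)          // 24
// Each regional indicator takes four bytes in UTF-8.
print(flag.utf8.count)           // 48

// The two scalars that make up a single Canadian flag.
let scalars = "🇨🇦".unicodeScalars.map { String($0.value, radix: 16, uppercase: true) }
print(scalars) // ["1F1E8", "1F1E6"]
```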

To separate the string into "visible entities" you have to consider "composed character sequences", compare How to know if two emojis will be displayed as one emoji?.

I do not have an elegant solution (perhaps someone has a better one). But one option would be to separate the string into an array of composed characters, remove elements from the array if necessary, and then combine the strings again.

Example:

extension String {

    func composedCharacters() -> [String] {
        var result: [String] = []
        enumerateSubstringsInRange(characters.indices, options: .ByComposedCharacterSequences) {
            (subString, _, _, _) in
            if let s = subString { result.append(s) }
        }
        return result
    }
}

var flags = "🇩🇪🇩🇪🇩🇪🇨🇦🇨🇦🇨🇦"
var chars = flags.composedCharacters()
print(chars.count) // 6
chars.removeLast()
flags = chars.joinWithSeparator("")
print(flags) // 🇩🇪🇩🇪🇩🇪🇨🇦🇨🇦
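
The example above uses Swift 2 names. In Swift 3 and later the same idea could be spelled roughly as follows. This is a sketch that assumes Foundation is available, not a tested replacement; only the method names differ (enumerateSubstrings(in:options:), .byComposedCharacterSequences, joined()).

```swift
import Foundation

extension String {

    // Splits the string into composed character sequences using Foundation's
    // substring enumeration, the same approach as the Swift 2 version above.
    func composedCharacters() -> [String] {
        var result: [String] = []
        enumerateSubstrings(in: startIndex..<endIndex, options: .byComposedCharacterSequences) {
            (subString, _, _, _) in
            if let s = subString { result.append(s) }
        }
        return result
    }
}

var flags = "🇩🇪🇩🇪🇩🇪🇨🇦🇨🇦🇨🇦"
var chars = flags.composedCharacters()
print(chars.count)
chars.removeLast()          // drop the last visible character
flags = chars.joined()      // recombine the remaining pieces
print(flags)
```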
