tgk
tgk

Reputation: 4116

Encoding Swift string as escaped unicode?

The API data field only supports ASCII encoding -- but I need to support Unicode (emoji, foreign characters, etc.)

I'd like to encode the user's text input as an escaped unicode string:

let textContainingUnicode = """
Let's go 🏊 in the 🌊.
  And some new lines.
"""

let result = textContainingUnicode.unicodeScalars.map { $0.escaped(asASCII: true)}
  .joined(separator: "")
  .replacingOccurrences(
    of: "\\\\u\\{(.+?(?=\\}))\\}", <- converting swift format \\u{****}
    with: "\\\\U$1",               <- into format python expects
    options: .regularExpression)

result here is "Let\'s go \U0001F3CA in the \U0001F30A.\n And some new lines."

And on the server decoding with python:

codecs.decode("Let\\'s go \\U0001F3CA in the \\U0001F30A.\\n And some new lines.\n", 'unicode_escape')

But this smells funny -- do i really need to do so much string manipulation in swift to get the escaped unicode? Are these formats not standardized across languages.

Upvotes: 5

Views: 2236

Answers (1)

Leo Dabus
Leo Dabus

Reputation: 236568

You can use reduce in your collection and check if each character isASCII, if true return that character otherwise convert the special character to unicode:

Swift 5.1 • Xcode 11

extension Unicode.Scalar {
    var hexa: String { .init(value, radix: 16, uppercase: true) }
}

extension Character {
    var hexaValues: [String] {
        unicodeScalars
            .map(\.hexa)
            .map { #"\\U"# + repeatElement("0", count: 8-$0.count) + $0 }
    }
}

extension StringProtocol where Self: RangeReplaceableCollection {
    var asciiRepresentation: String { map { $0.isASCII ? .init($0) : $0.hexaValues.joined() }.joined() }
}

let textContainingUnicode = """
Let's go 🏊 in the 🌊.
  And some new lines.
"""

let asciiRepresentation = textContainingUnicode.asciiRepresentation
print(asciiRepresentation)  // "Let's go \\U0001F3CA in the \\U0001F30A.\n  And some new lines."

Upvotes: 5

Related Questions