Another Dude

Reputation: 1451

Swift: How to get UTF-8 representation of characters (as 0xXX 0xXX 0xXX...)?

I'd like to get the UTF-8 representation of a character.

For example, according to this webpage, the UTF-8 encoding of 😀 is 0xF0 0x9F 0x98 0x80, and its UTF-16 encoding is 0xD83D 0xDE00.

I have tried this code:

extension String {

    var utf8Representation: String? {
        guard let data = self.data(using: .nonLossyASCII, allowLossyConversion: true), 
              let result = String(data: data, encoding: .utf8) else {
            return nil
        }
        return result
    }

}

But here is the result I get:

😀 = \ud83d\ude00

This is the UTF-16 representation, not the UTF-8 representation I was expecting.

What should I do?

Thanks for your help

Upvotes: 2

Views: 508

Answers (1)

Martin R

Reputation: 540145

The .nonLossyASCII conversion replaces each non-ASCII character with "\uNNNN" escape sequences, one per UTF-16 code unit, which is why your approach produces the UTF-16 values instead of the UTF-8 bytes.
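To see that behavior in isolation, here is a minimal sketch. The helper name nonLossyASCIIEscaped is hypothetical and exists only for illustration; the output shown is the one reported in the question.

```swift
import Foundation

// Hypothetical helper, for illustration only: encodes the string with
// .nonLossyASCII and reads the resulting bytes back as plain ASCII text.
func nonLossyASCIIEscaped(_ s: String) -> String? {
    guard let data = s.data(using: .nonLossyASCII) else { return nil }
    return String(data: data, encoding: .ascii)
}

// The emoji needs two UTF-16 code units (a surrogate pair),
// so it comes back as two \uNNNN escapes.
print(nonLossyASCIIEscaped("😀") ?? "nil")
// \ud83d\ude00
```

ASCII-only input passes through unchanged, so the escapes appear only for characters outside the ASCII range.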

self.utf8 gives the UTF-8 representation of a String as a collection of UInt8 code units (bytes). Format each code unit as a "0xNN" string, and join the results with space characters:

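import Foundation // needed for String(format:)

extension String {
    /// The UTF-8 bytes of the string, each formatted as "0xNN" and separated by spaces.
    var utf8Representation: String {
        return self.utf8.map { String(format: "0x%02hhx", $0) }.joined(separator: " ")
    }
}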

Example:

print("😀".utf8Representation)
// 0xf0 0x9f 0x98 0x80
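If you prefer the uppercase style from the question, or want to avoid Foundation, the same idea might be sketched with String(_:radix:uppercase:) from the standard library. The hexDump function below is a hypothetical helper, not part of the original answer; it is generic over the code unit type so that it also works on the utf16 view.

```swift
// Hypothetical helper, not from the original answer: formats any sequence of
// fixed-width unsigned code units as zero-padded, uppercase hex values.
func hexDump<S: Sequence>(_ units: S) -> String
    where S.Element: FixedWidthInteger & UnsignedInteger {
    // Hex digits per code unit: 2 for UInt8 (UTF-8), 4 for UInt16 (UTF-16).
    let width = S.Element.bitWidth / 4
    return units.map { unit in
        let hex = String(unit, radix: 16, uppercase: true)
        // Left-pad with zeros up to the full width of the code unit.
        return "0x" + String(repeating: "0", count: width - hex.count) + hex
    }.joined(separator: " ")
}

print(hexDump("😀".utf8))
// 0xF0 0x9F 0x98 0x80
print(hexDump("😀".utf16))
// 0xD83D 0xDE00
```

Both outputs match the values quoted in the question.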

Upvotes: 3
