Reputation: 1451
I'd like to get the UTF-8 representation of a character.
For example, according to this webpage, 😀 should be 0xF0 0x9F 0x98 0x80, and UTF-16 is 0xD83D 0xDE00.
I have tried this code:
extension String {
    var utf8Representation: String? {
        guard let data = self.data(using: .nonLossyASCII, allowLossyConversion: true),
              let result = String(data: data, encoding: .utf8) else {
            return nil
        }
        return result
    }
}
But here is the result I get:
😀 = \ud83d\ude00
This is the UTF-16 representation, not the UTF-8 representation that I was expecting.
What should I do?
Thanks for your help
Upvotes: 2
Views: 508
Reputation: 540145
The .nonLossyASCII conversion replaces each non-ASCII character with "\uNNNN" escape sequences, one per UTF-16 code unit, which is why your approach does not work.
self.utf8 gives the UTF-8 representation of a String. Format each UTF-8 code unit as a "0xNN" string, and join the results with space characters:
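import Foundation // needed for String(format:)

extension String {
    var utf8Representation: String {
        // Format every UTF-8 code unit (UInt8) as two lowercase hex digits.
        return self.utf8.map { String(format: "0x%02hhx", $0) }.joined(separator: " ")
    }
}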
Example:
print("😀".utf8Representation)
// 0xf0 0x9f 0x98 0x80
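If you want uppercase digits as in the question, or the UTF-16 code units as well, the same idea can be sketched without Foundation by using String(_:radix:uppercase:) and padding manually. The names hexCodeUnits, utf8Hex and utf16Hex below are hypothetical helpers made up for this illustration, not part of the standard library:

```swift
// Hypothetical helper: formats each integer code unit as "0x" plus
// zero-padded uppercase hex digits, separated by spaces.
func hexCodeUnits<S: Sequence>(_ units: S, width: Int) -> String
    where S.Element: BinaryInteger {
    return units.map { unit -> String in
        let digits = String(unit, radix: 16, uppercase: true)
        // Left-pad with zeros up to the requested width.
        let padding = String(repeating: "0", count: max(0, width - digits.count))
        return "0x" + padding + digits
    }.joined(separator: " ")
}

extension String {
    // UTF-8 code units are 8 bits wide, so two hex digits each.
    var utf8Hex: String { return hexCodeUnits(self.utf8, width: 2) }
    // UTF-16 code units are 16 bits wide, so four hex digits each.
    var utf16Hex: String { return hexCodeUnits(self.utf16, width: 4) }
}

print("😀".utf8Hex)  // 0xF0 0x9F 0x98 0x80
print("😀".utf16Hex) // 0xD83D 0xDE00
```

The UTF-16 output is the surrogate pair that the .nonLossyASCII attempt in the question produced as \ud83d\ude00.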
Upvotes: 3