rv7284

Reputation: 1102

String to UTF-32 string

I have gone through many questions here, but none of the answers seem to work for me. I simply want to convert my string to a UTF-32 string, as shown in the image.

var str = "Your"

let dataenc = str.data(using: String.Encoding.utf32)

extension Data {
    func hexEncodedString() -> String {
        return map { String(format: "%04hhx", $0) }.joined()
    }
}

let data = str.data(using: .utf16)!
let hexString = data.map{ String(format:"%02x", $0) }.joined()

print(data.hexEncodedString())
print(hexString)

This doesn't work.

The output I get is:

00ff00fe00590000006f00000075000000720000

fffe59006f0075007200

I'm not sure what to do. Thanks in advance.

Upvotes: 0

Views: 797

Answers (1)

rmaddy

Reputation: 318955

To get the same result you need to use the .utf32BigEndian string encoding.

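import Foundation

extension Data {
    // Renders each byte as exactly two lowercase hex digits.
    func hexEncodedString() -> String {
        return map { String(format: "%02x", $0) }.joined()
    }
}

let str = "Your"
// .utf32BigEndian fixes the byte order and omits the BOM.
let dataenc = str.data(using: .utf32BigEndian)!
print(dataenc.hexEncodedString())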

Output:

000000590000006f0000007500000072

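Note that when using just .utf32 you get 20 bytes for the string "Your", but with .utf32BigEndian you get only 16 bytes for the same string. Those extra 4 bytes are the BOM (byte order mark). Plain .utf32 gives you the data in little-endian format with the BOM (ff fe 00 00) at the start, so the 4 bytes of each character appear in the reverse of the order you want. Also note that the output in your question was not produced from dataenc at all: you printed the .utf16 data, and the "%04hhx" format padded every byte to four hex digits. That's why the first line starts with 00ff00fe (the UTF-16 BOM ff fe, padded) and why the second line shows the same UTF-16 bytes printed with "%02x".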
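To see the size difference for yourself, a small check along these lines may help. It is only a sketch: the names withBOM and bigEndian are made up for illustration, and the exact bytes produced by plain .utf32 depend on the platform's Foundation implementation, so the code prints them rather than assuming them.

```swift
import Foundation

let sample = "Your"

// Plain .utf32 lets Foundation pick the byte order and prepend a BOM.
let withBOM = sample.data(using: .utf32)!
// .utf32BigEndian fixes the byte order and writes no BOM.
let bigEndian = sample.data(using: .utf32BigEndian)!

// Compare the sizes; the difference is the length of the BOM.
print("utf32 bytes:", withBOM.count)
print("utf32BigEndian bytes:", bigEndian.count)

// Show the first four bytes of the .utf32 data, where the BOM sits.
print(withBOM.prefix(4).map { String(format: "%02x", $0) }.joined(separator: " "))
```

If the first four bytes come out as ff fe 00 00, the data is little-endian with a BOM, which matches the behaviour described above.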

Explicitly stating .utf32BigEndian puts the bytes in the desired order and eliminates the BOM.
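If you only need the hex text and not the Data, another option is to skip the encoding step and format each Unicode scalar directly. This is a rough sketch rather than part of the approach above; utf32BigEndianHex is a hypothetical helper name.

```swift
import Foundation

// Hypothetical helper: builds the big-endian UTF-32 hex string
// directly from the string's Unicode scalars, with no BOM.
func utf32BigEndianHex(_ s: String) -> String {
    // Each scalar value is a UInt32. "%08x" prints it as eight hex digits,
    // most significant digits first, which matches UTF-32BE byte order.
    return s.unicodeScalars.map { String(format: "%08x", $0.value) }.joined()
}

print(utf32BigEndianHex("Your"))
// prints 000000590000006f0000007500000072
```

Because every Unicode scalar fits in 32 bits, this gives one 8-digit group per scalar, the same layout that .utf32BigEndian produces.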

Upvotes: 2
