cluster1

Reputation: 5752

Converting string to data - What happens when the wrong encoding is used?

Let's say I've got a string with characters that don't exist in ASCII.

When I use the correct encoding everything works fine.

let example = "Testing, ÜÄÖ ?ß 123 ..."
let data = example.data(using: .utf8)
let example2 = String(decoding: data!, as: UTF8.self)
print(example2) // Testing, ÜÄÖ ?ß 123 ...

When I change the encoding to 'String.Encoding.ascii', nil is returned. But what happens there in the background? Can't it find a bit combination for the character?

How is each character transformed to data and what happens if the transformation fails? Can someone explain it in simple terms?

Upvotes: 0

Views: 37

Answers (1)

Kiryl Famin

Reputation: 337

As the Wikipedia article on UTF-8 states:

It was designed for backward compatibility with ASCII: the first 128 characters of Unicode, which correspond one-to-one with ASCII, are encoded using a single byte with the same binary value as ASCII, so that a UTF-8-encoded file using only those characters is identical to an ASCII file.

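Basically, ASCII is a subset of UTF-8, so encoding to ASCII simply fails when the string contains a character whose UTF-8 representation is longer than one byte. ASCII has no byte value for such a character, which is why nil is returned.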
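
To make this concrete, here is a minimal sketch that prints the UTF-8 bytes of a few characters and then tries both encodings. The helper utf8Bytes(of:) and the sample string are hypothetical names made up for this illustration, and the nil result for .ascii is the behaviour the question reports rather than a guarantee for every platform.

```swift
import Foundation

// Hypothetical helper (not from the question): the UTF-8 bytes of one character.
func utf8Bytes(of character: Character) -> [UInt8] {
    return Array(String(character).utf8)
}

// "TÜß", written with escapes so the code points are unambiguous.
let text = "T\u{DC}\u{DF}"

for character in text {
    // An ASCII character needs one byte with a value below 128.
    // Any other character needs two to four bytes in UTF-8.
    print(character, utf8Bytes(of: character), character.isASCII)
}

// UTF-8 can encode every Unicode scalar, so this conversion succeeds.
let utf8Data = text.data(using: .utf8)

// ASCII has no byte for "Ü" or "ß", so the whole conversion is rejected.
let asciiData = text.data(using: .ascii)

print(utf8Data != nil, asciiData == nil)
```

If you would rather check up front than handle nil afterwards, text.allSatisfy { $0.isASCII } tells you whether the ASCII conversion can succeed.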

Upvotes: 1
