john2000

Reputation: 75

What does this sentence mean?

I read Python's PEP 100 today. The 'Unicode Default Encoding' section says: 'The Unicode implementation has to make some assumption about the encoding of 8-bit strings passed to it for coercion and about the encoding to as default for conversion of Unicode to strings when no specific encoding is given.'

My question is: what does '8-bit strings' mean? Does it mean ASCII?

Upvotes: 0

Views: 126

Answers (2)

Martijn Pieters

Reputation: 1123590

No, ASCII is a 7-bit encoding. Most text encodings (including UTF-8 and ISO-8859) are 8-bit encodings.

Generally speaking, anything beyond the basic ASCII character set needs more than 7 bits to encode. So when dealing with international data, you usually deal with encodings that can use multiple bytes per encoded character. Python will automatically try to decode byte strings to Unicode when you combine Unicode and byte string types, and the default encoding (in Python 2) is ASCII. This is a frequent source of UnicodeDecodeError exceptions in Python.
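To make the failure concrete, here is a minimal sketch (not taken from the PEP). It does the decode explicitly with the `ascii` codec, which is what Python 2 attempts implicitly when a byte string meets a Unicode string. The helper name `try_ascii_decode` and the sample text are just assumptions for illustration.

```python
# Sample text with one non-ASCII character (U+00E9, e with acute accent).
text = u"caf\u00e9"

# Encoding to UTF-8 gives an 8-bit (byte) string; the accent takes two bytes.
raw = text.encode("utf-8")


def try_ascii_decode(data):
    """Hypothetical helper: decode bytes as ASCII, reporting failure as text."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as exc:
        # Any byte with the high bit set (value >= 128) is outside ASCII.
        return "failed: %s" % exc.reason


print(try_ascii_decode(b"cafe"))  # pure ASCII bytes decode fine
print(try_ascii_decode(raw))      # bytes above 127 cannot be ASCII
print(raw.decode("utf-8") == text)  # naming the right codec round-trips
```

Decoding with the encoding the bytes were actually written in, rather than relying on the ASCII default, is what avoids the exception.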

You really want to read up on Unicode and text encodings before you proceed though. I can recommend:

Upvotes: 4

avasal

Reputation: 14864

UTF-8 is used to support a large range of characters. In UTF-8, up to 4 bytes can be used to represent a single character.

ASCII defines only 128 characters, so it needs only 7 bits per character, but each character is normally stored in 8 bits. RS232 (old serial communication) can be used with 7-bit bytes.
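A quick way to check those sizes yourself is the sketch below. The sample characters and the helper names `utf8_length` and `is_ascii` are my own picks for illustration.

```python
# One sample character per UTF-8 length: "A", e-acute, euro sign, an emoji.
samples = [u"A", u"\u00e9", u"\u20ac", u"\U0001F600"]


def utf8_length(ch):
    """Number of bytes UTF-8 needs for this character."""
    return len(ch.encode("utf-8"))


def is_ascii(ch):
    """True if the code point fits in 7 bits (0..127)."""
    return ord(ch) < 2 ** 7


lengths = [utf8_length(ch) for ch in samples]
ascii_flags = [is_ascii(ch) for ch in samples]

print(lengths)      # bytes per character in UTF-8
print(ascii_flags)  # only the first sample is within ASCII
```

Only the first sample falls inside the 128 ASCII code points; the others need 2, 3, and 4 bytes in UTF-8.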

Upvotes: 2
