malana

Reputation: 5230

What's the point of chr(128) .. chr(255) in Python?

Edit: I'm talking about behavior in Python 2.7.

The chr function converts integers between 0 and 127 into the corresponding ASCII characters. For example:

>>> chr(65)
'A'

I get how this is useful in certain situations and I understand why it covers 0..127, the 7-bit ASCII range.

The function also takes arguments from 128..255. For these numbers, it simply returns the hexadecimal representation of the argument. In this range, different bytes mean different things depending on which part of the ISO-8859 standard is used.

I'd understand if chr took another argument, e.g.

>>> chr(228, encoding='iso-8859-1') # hypothetical
'ä'

However, there is no such option:

chr(i) -> character

Return a string of one character with ordinal i; 0 <= i < 256.

My question is: what is the point of raising ValueError for i > 255 instead of i > 127, when all the function appears to do for 128 <= i < 256 is return hex values?

Upvotes: 10

Views: 19969

Answers (3)

kindall

Reputation: 184345

In Python 2.x, a str is a sequence of bytes, so chr() returns a string of one byte and accepts values in the range 0-255, the range a single byte can represent. When you print the repr() of a string containing a byte in the range 128-255, that byte is shown as an escape sequence (for example '\xc8') because there is no standard way to represent such characters (ASCII defines only 0-127). You can, however, convert it to Unicode with unicode() by specifying the source encoding:

unicode(chr(200), encoding="latin1")

In Python 3.x, str is a sequence of Unicode characters and chr() takes a much larger range. Bytes are handled by the bytes type.
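To make the split between text and bytes concrete, here is a minimal Python 3 sketch (not Python 2.7, which the question is about). The variable names are illustrative only, and the second codec, iso-8859-5, is an arbitrary choice to show that the same byte decodes differently depending on the encoding.

```python
# Python 3 sketch: text (str) and binary data (bytes) are separate types.
code = 200

as_text = chr(code)        # one-character str, code point U+00C8
as_byte = bytes([code])    # one-byte bytes object, no encoding implied

# repr() of the byte uses an escape because no encoding is attached to it.
print(repr(as_byte))       # b'\xc8'

# Decoding makes the encoding explicit; the result depends on the codec.
latin1 = as_byte.decode("iso-8859-1")
cyrillic = as_byte.decode("iso-8859-5")
print(latin1, cyrillic)
```

Decoding with iso-8859-1 gives the same character as chr(200) because the first 256 Unicode code points coincide with that encoding.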

Upvotes: 11

hdante

Reputation: 8030

Note that Python 2 string handling is broken. It's one of the reasons I recommend switching to Python 3.

In Python 2, the string type was designed to represent both text and binary strings. So chr() is used to convert an integer to a byte. It's not really related to text, ASCII, or ISO-8859-1. The result is a binary stream of bytes:

# 'device' stands for any object with a write() method
binary_command = chr(100) + chr(200) + chr(10)
device.write(binary_command)
# ... and so on

In Python 2.6 and later, bytes is available for forward compatibility with Python 3; in Python 2 it is simply an alias for str.
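For comparison, a rough Python 3 equivalent of the snippet above might look like the following. The `device` object is hypothetical in the original, so an in-memory `io.BytesIO` buffer stands in for it here purely as an assumption for illustration.

```python
import io

# Hypothetical stand-in for the real device: any object with write() would do.
device = io.BytesIO()

# In Python 3, bytes() builds binary data directly from integers in 0..255.
binary_command = bytes([100, 200, 10])
device.write(binary_command)

# Read back what was "sent" to confirm the three raw byte values.
sent = device.getvalue()
print(list(sent))
```

No chr() call is involved: in Python 3, chr() is reserved for text, and integers in 0..255 go straight into a bytes object.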

Upvotes: 0

Simeon Visser

Reputation: 122486

I see what you're saying but it isn't correct. In Python 3.4 chr is documented as:

Return the string representing a character whose Unicode codepoint is the integer i.

And here are some examples:

>>> chr(15000)
'㪘'
>>> chr(5000)
'ᎈ'

In Python 2.x it was:

Return a string of one character whose ASCII code is the integer i.

The function chr has been around for a long time in Python, and I think support for various encodings only matured in more recent releases. In that sense it makes sense to support the basic ASCII table and return hex escapes for the extended range 128-255.

Even within Unicode the ASCII set is only defined as 128 characters, not 256, so there isn't (wasn't) a standard and accepted way of letting chr() return a character for those input values.
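As a rough illustration of the Python 3 behaviour described in this answer, the sketch below probes the range chr() accepts and checks the ord() round trip. The helper `try_chr` is a made-up name for this example, not part of the standard library.

```python
def try_chr(i):
    """Return chr(i), or None if Python 3 rejects the code point."""
    try:
        return chr(i)
    except ValueError:
        return None

highest = try_chr(0x10FFFF)   # last valid Unicode code point
too_big = try_chr(0x110000)   # one past the end: chr() raises ValueError

# ord() is the inverse of chr() for any valid code point.
round_trip = ord(chr(228))

# Code points 128..255 match the ISO-8859-1 (Latin-1) byte values.
same_as_latin1 = chr(228) == bytes([228]).decode("latin-1")
```

So in Python 3 the ValueError boundary sits at the end of the Unicode range rather than at 255.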

Upvotes: 0
