Aske Olesen

Reputation: 17

Problems understanding rb mode and read()

I have a .txt file called 'testinput.txt' containing a string of eight zeros and nothing else.

Doing the following in Python prints the value 48 to the terminal. Can someone explain what is going on? (I would have expected 0 to be printed.)

f = open('testinput.txt','rb')
print(f.read(1)[0])

I hope someone can help clarify this.

Thanks in advance.

Upvotes: 1

Views: 111

Answers (1)

deceze

Reputation: 522442

While bytes literals and representations are based on ASCII text, bytes objects actually behave like immutable sequences of integers, with each value in the sequence restricted such that 0 <= x < 256 [..]. This is done deliberately to emphasise that while many binary formats include ASCII based elements and can be usefully manipulated with some text-oriented algorithms, this is not generally the case for arbitrary binary data (blindly applying text processing algorithms to binary data formats that are not ASCII compatible will usually lead to data corruption).

https://docs.python.org/3/library/stdtypes.html#bytes

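So, indexing a single offset within a bytes object gives you a number: the numeric byte value, as in "bytes objects actually behave like immutable sequences of integers" in the quote above. Your file contains the character "0", whose ASCII code is 48, which is why 48 is printed rather than 0. Slicing a range of bytes gives you a bytes sequence instead: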

>>> b'0'
b'0'
>>> b'0'[0]
48
>>> b'0'[0:1]
b'0'
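
To tie this back to the file in the question, here is a minimal sketch of getting from the raw byte to the digit you expected. It assumes the file holds plain ASCII digits; the temporary directory and the variable names are only there for illustration and are not part of the original code.

```python
import os
import tempfile

# Recreate the file from the question: eight ASCII zeros and nothing else.
# (Written to a temporary directory here; this location is an assumption.)
path = os.path.join(tempfile.mkdtemp(), "testinput.txt")
with open(path, "wb") as f:
    f.write(b"00000000")

with open(path, "rb") as f:
    first = f.read(1)             # a bytes object of length 1: b'0'

byte_value = first[0]             # indexing gives the integer byte value
as_text = first.decode("ascii")   # decoding gives the one-character str '0'
as_number = int(as_text)          # parsing that text gives the number 0

print(byte_value, repr(as_text), as_number)
```

If you only ever want text, opening the file in 'r' mode instead of 'rb' gives you str objects directly, and indexing those returns characters rather than integers.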

Upvotes: 2
