Reputation: 23
Test File Contents (binary file, shown in hex)
00010203 04050607 08090A0B 0C0D0E0F
10111213 14151617 18191A1B 1C1D1E1F
20212223 24252627 28292A2B 2C2D2E2F
30313233 34353637 38393A3B 3C3D3E3F
40414243 44454647 48494A4B 4C4D4E4F
50515253 54555657 58595A5B 5C5D5E5F
60616263 64656667 68696A6B 6C6D6E6F
70717273 74757677 78797A7B 7C7D7E7F
80818283 84858687 88898A8B 8C8D8E8F
90919293 94959697 98999A9B 9C9D9E9F
A0A1A2A3 A4A5A6A7 A8A9AAAB ACADAEAF
B0B1B2B3 B4B5B6B7 B8B9BABB BCBDBEBF
C0C1C2C3 C4C5C6C7 C8C9CACB CCCDCECF
D0D1D2D3 D4D5D6D7 D8D9DADB DCDDDEDF
E0E1E2E3 E4E5E6E7 E8E9EAEB ECEDEEEF
F0F1F2F3 F4F5F6F7 F8F9FAFB FCFDFEFF
Test Code
# open the file in binary mode
f1 = open('test.txt', 'rb')
# declare variables
address = 0
# read one byte at a time
while address < 256:
    byte = f1.read(1)
    print(byte)
    address = address + 1
f1.close()
What is Returned
b'\x00'
b'\x01'
b'\x02'
b'\x03'
b'\x04'
b'\x05'
b'\x06'
b'\x07'
b'\x08'
b'\t'
b'\n'
b'\x0b'
b'\x0c'
b'\r'
b'\x0e'
b'\x0f'
b'\x10'
b'\x11'
b'\x12'
b'\x13'
b'\x14'
b'\x15'
b'\x16'
b'\x17'
b'\x18'
b'\x19'
b'\x1a'
b'\x1b'
b'\x1c'
b'\x1d'
b'\x1e'
b'\x1f'
b' '
b'!'
b'"'
b'#'
b'$'
b'%'
b'&'
b"'"
b'('
b')'
b'*'
b'+'
b','
b'-'
b'.'
b'/'
b'0'
b'1'
b'2'
b'3'
b'4'
b'5'
b'6'
b'7'
b'8'
b'9'
b':'
b';'
b'<'
b'='
b'>'
b'?'
b'@'
b'A'
b'B'
b'C'
b'D'
b'E'
b'F'
b'G'
b'H'
b'I'
b'J'
b'K'
b'L'
b'M'
b'N'
b'O'
b'P'
b'Q'
b'R'
b'S'
b'T'
b'U'
b'V'
b'W'
b'X'
b'Y'
b'Z'
b'['
b'\\'
b']'
b'^'
b'_'
b'`'
b'a'
b'b'
b'c'
b'd'
b'e'
b'f'
b'g'
b'h'
b'i'
b'j'
b'k'
b'l'
b'm'
b'n'
b'o'
b'p'
b'q'
b'r'
b's'
b't'
b'u'
b'v'
b'w'
b'x'
b'y'
b'z'
b'{'
b'|'
b'}'
b'~'
b'\x7f'
b'\x80'
b'\x81'
b'\x82'
b'\x83'
b'\x84'
b'\x85'
b'\x86'
b'\x87'
b'\x88'
b'\x89'
b'\x8a'
b'\x8b'
b'\x8c'
b'\x8d'
b'\x8e'
b'\x8f'
b'\x90'
b'\x91'
b'\x92'
b'\x93'
b'\x94'
b'\x95'
b'\x96'
b'\x97'
b'\x98'
b'\x99'
b'\x9a'
b'\x9b'
b'\x9c'
b'\x9d'
b'\x9e'
b'\x9f'
b'\xa0'
b'\xa1'
b'\xa2'
b'\xa3'
b'\xa4'
b'\xa5'
b'\xa6'
b'\xa7'
b'\xa8'
b'\xa9'
b'\xaa'
b'\xab'
b'\xac'
b'\xad'
b'\xae'
b'\xaf'
b'\xb0'
b'\xb1'
b'\xb2'
b'\xb3'
b'\xb4'
b'\xb5'
b'\xb6'
b'\xb7'
b'\xb8'
b'\xb9'
b'\xba'
b'\xbb'
b'\xbc'
b'\xbd'
b'\xbe'
b'\xbf'
b'\xc0'
b'\xc1'
b'\xc2'
b'\xc3'
b'\xc4'
b'\xc5'
b'\xc6'
b'\xc7'
b'\xc8'
b'\xc9'
b'\xca'
b'\xcb'
b'\xcc'
b'\xcd'
b'\xce'
b'\xcf'
b'\xd0'
b'\xd1'
b'\xd2'
b'\xd3'
b'\xd4'
b'\xd5'
b'\xd6'
b'\xd7'
b'\xd8'
b'\xd9'
b'\xda'
b'\xdb'
b'\xdc'
b'\xdd'
b'\xde'
b'\xdf'
b'\xe0'
b'\xe1'
b'\xe2'
b'\xe3'
b'\xe4'
b'\xe5'
b'\xe6'
b'\xe7'
b'\xe8'
b'\xe9'
b'\xea'
b'\xeb'
b'\xec'
b'\xed'
b'\xee'
b'\xef'
b'\xf0'
b'\xf1'
b'\xf2'
b'\xf3'
b'\xf4'
b'\xf5'
b'\xf6'
b'\xf7'
b'\xf8'
b'\xf9'
b'\xfa'
b'\xfb'
b'\xfc'
b'\xfd'
b'\xfe'
b'\xff'
Running the code returns this. For my program to work correctly, I need values like b'!' to be returned as b'\x21'. What can I do to accomplish this? Thanks for your help!
Upvotes: 2
Views: 1350
Reputation: 1125388
The byte values are correct. Python just chooses to show you ASCII characters when possible, to aid debugging:
>>> bytes([0x21])
b'!'
>>> bytes([0x21])[0]
33
The actual byte value is still 33 decimal, 21 hexadecimal, but that byte maps to an ASCII character. Any printable ASCII codepoint will be displayed as such whenever you produce the representation (repr()) output for a bytes object, as that is far more readable. Certain characters (newline, carriage return) are displayed using their corresponding literal escape syntax, e.g. \n or \r, while only the remainder uses \xhh hex codes. Would you rather Python displays b'\x48\x65\x6c\x6c\x6f\x20\x77\x6f\x72\x6c\x64\x0a' or b'Hello world\n' when debugging code handling bytes?
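As a quick check that only the display differs and not the data, here is a small sketch comparing the two spellings of the same byte. The variable names are illustrative only and not part of the question's code.

```python
# Two literals for the same single byte: ASCII form and hex-escape form.
printable = b'!'
escaped = b'\x21'

# They compare equal because the underlying value is identical (33 decimal).
print(printable == escaped)   # True
print(printable[0])           # 33
print(repr(escaped))          # b'!'  (repr prefers the ASCII character)
```

So a comparison such as byte == b'\x21' in your program already works, no matter how print() shows the value.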
If you want to display hex values, explicitly format the byte value:
print(format(byte[0], '02x'))
to display it as a 2-digit lowercase hex, or
print(format(byte[0], '#04x'))
to include a leading 0x. Use X for uppercase.
Demo:
>>> format(bytes([0x21])[0], '02x')
'21'
>>> format(bytes([0x21])[0], '#04x')
'0x21'
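Applied to the loop from the question, this might look like the sketch below. To keep it self-contained, io.BytesIO stands in for open('test.txt', 'rb') (an assumption for illustration), and the lines list is a hypothetical helper so the output can be inspected.

```python
import io

# Stand-in for open('test.txt', 'rb'): an in-memory file holding bytes 0x00..0xFF.
f1 = io.BytesIO(bytes(range(256)))

lines = []
address = 0
while address < 256:
    byte = f1.read(1)
    # byte is a bytes object of length 1; byte[0] is its integer value.
    lines.append(format(address, '02x') + ': ' + format(byte[0], '#04x'))
    address += 1

# Show a few rows around the '!' byte.
print('\n'.join(lines[0x20:0x23]))
```

With a real file, replacing the BytesIO line with the original open() call should be the only change needed.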
If you want to display a series of bytes, you can use the binascii.hexlify() function:
>>> from binascii import hexlify
>>> hexlify(b'Hello world\n')
b'48656c6c6f20776f726c640a'
>>> print(hexlify(b'Hello world\n').decode('ASCII'), b'Hello world\n', sep='\t')
48656c6c6f20776f726c640a b'Hello world\n'
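On Python 3.5 and later, bytes objects also have a hex() method that returns a str directly. This is an alternative to hexlify(), not something the original answer relies on; the sketch below just checks that both give the same digits.

```python
from binascii import hexlify

data = b'Hello world\n'

# bytes.hex() returns a str; hexlify() returns bytes that need decoding.
via_method = data.hex()
via_hexlify = hexlify(data).decode('ascii')

print(via_method)
print(via_method == via_hexlify)
```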
With a bit of formatting, you can make any binary file display in both hexadecimal and ASCII representations.
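One possible shape for such a display is sketched below. The hexdump() function, its width parameter, and the column layout are all illustrative choices, not a standard library API.

```python
def hexdump(data, width=16):
    """Return a list of lines: offset, hex bytes, and printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = ' '.join(format(b, '02x') for b in chunk)
        # Show printable ASCII (0x20..0x7e) as-is, everything else as '.'.
        ascii_part = ''.join(chr(b) if 0x20 <= b <= 0x7e else '.' for b in chunk)
        # Pad the hex column so the ASCII column lines up on a short last row.
        lines.append('{:08x}  {:<{w}}  {}'.format(
            offset, hex_part, ascii_part, w=width * 3 - 1))
    return lines


for line in hexdump(b'Hello world\n'):
    print(line)
```

For a file, you could pass the result of f1.read() to hexdump(); for very large files, reading and dumping in chunks would be the safer choice.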
Upvotes: 4