Reputation: 19
I am trying to read a binary file using the following code:
with open("binaryfile.bin", 'rb') as f1:
    for line in f1.readlines():
        print(line)
It is returning gibberish data like
@ ç─+@@@d*d)
⌡Å2
q_ Ç
I have verified that the data in the file is correct and I can read it using the od command on the command line
od -w8 -Ad -x binaryfile.bin
Output:
0000000 0011 0022 0066 0066
0000008 0066 0066 0066 0066
*
0000032 1234 0000 0000 0000
0000040 0000 0000 0000 0000
*
0000080 0000 0000 0000 0056
0000088 0011
The problem with the 'od' command is that when two or more consecutive lines are identical, it replaces the repeated lines with "*\n". This happens even more often when I print only two bytes per line, because more lines are then identical.
od -w2 -Ad -x binaryfile.bin
Output:
0000000 0011
0000002 0022
0000004 0066
*
0000032 1234
0000034 0000
*
0000086 0056
0000088 0011
I want to read each and every line.
Q1: Can anyone suggest why reading the file in 'rb' mode is not working?
Q2: Is there an option to read the complete file using the 'od' command without collapsing the repeated lines?
Upvotes: 0
Views: 1323
Reputation: 10237
open("binaryfile.bin" , 'rb')
works correctly: it reads the data as bytes. The problem is the print step. When the raw bytes are written to the console, they are interpreted as text (for example as 'utf-8'), and since this is not a text file, the result is weird characters.
You could use the binascii.hexlify method to convert the bytes to the hex representation you want:
import binascii
with open("binaryfile.bin", 'rb') as f1:
    for line in f1.readlines():
        # NOTE: the 'sep' and 'bytes_per_sep' arguments are only available since Python 3.8
        print(binascii.hexlify(line, sep=' ', bytes_per_sep=2))
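As for Q2: od accepts a -v flag that prints every line instead of collapsing repeats into "*", so od -v -w2 -Ad -x binaryfile.bin should show all of them. If you would rather stay in Python, here is a rough sketch that imitates that layout by reading fixed-size chunks instead of relying on readlines(), which splits on newline bytes that mean nothing in a binary file. The names dump_words and dump_file are made up for this example, and the little-endian byte order is an assumption based on the od output in the question; switch it to "big" if your words come out swapped.

```python
def dump_words(data, width=8):
    """Return od-style lines: decimal offset followed by 16-bit hex words.

    Repeated lines are never collapsed, unlike od without -v.
    """
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        words = []
        for i in range(0, len(chunk), 2):
            # Assumption: 16-bit words are stored little-endian,
            # which is how od -x displays them on a typical x86 machine.
            word = int.from_bytes(chunk[i:i + 2], "little")
            words.append(f"{word:04x}")
        lines.append(f"{offset:07d} " + " ".join(words))
    return lines


def dump_file(path, width=8):
    """Print the whole file, one line per `width` bytes."""
    with open(path, "rb") as f1:
        data = f1.read()
    for line in dump_words(data, width):
        print(line)


# Example call (hypothetical usage, two bytes per line):
# dump_file("binaryfile.bin", width=2)
```

This reads the whole file into memory, which is fine for small files; for large ones you would want to read it in blocks instead.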
Upvotes: 3