mars

Reputation: 19

Unable to read binary file in Python

I am trying to read a binary file using the following code:

with open("binaryfile.bin", 'rb') as f1:
    for line in f1.readlines():
        print(line)

It prints gibberish such as:

@ ç─+@@@d*d)      

⌡Å2

  
q_ Ç        

I have verified that the data in the file is correct, and I can read it using the od command on the command line:

od -w8 -Ad -x binaryfile.bin

Output:

0000000 0011 0022 0066 0066
0000008 0066 0066 0066 0066
*  
0000032 1234 0000 0000 0000
0000040 0000 0000 0000 0000
* 
0000080 0000 0000 0000 0056
0000088 0011 

The problem with the 'od' command is that when two or more consecutive lines are identical, it replaces the repeated lines with "*\n". This happens more often if I print only two bytes per line, because more of the lines are then identical.

od -w2 -Ad -x binaryfile.bin

Output:

0000000 0011
0000002 0022
0000004 0066
*
0000032 1234
0000034 0000
*
0000086 0056
0000088 0011

I want to see each and every line.

Q1: Can anyone suggest why reading the file in the regular 'rb' mode is not working?
Q2: Is there an option to print the complete file using the 'od' command without collapsing the repeated lines?

Upvotes: 0

Views: 1323

Answers (1)

GProst

Reputation: 10237

open("binaryfile.bin", 'rb') works correctly: it reads the data as bytes. When you then print those bytes to the console, it tries to decode them as text (for example as 'utf-8') and produces weird characters, because the file you are reading is not a text file.
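To illustrate the difference, here is a minimal sketch using a few in-memory bytes instead of the real file. The sample values are hypothetical, chosen to resemble the start of the od output in the question under the assumption that od ran on a little-endian machine.

```python
# Hypothetical sample: four bytes resembling the start of the file in the question
# (assumes the od dump was produced on a little-endian machine).
data = bytes([0x11, 0x00, 0x22, 0x00])

# In Python 3, printing a bytes object shows its repr, with
# non-printable bytes written as \x.. escapes.
raw_repr = repr(data)
print(raw_repr)

# bytes.hex() returns a plain hexadecimal string, two digits per byte.
hex_text = data.hex()
print(hex_text)  # 11002200
```

The bytes themselves are the same in both cases; only the way they are rendered for display differs.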

You could use the binascii.hexlify function to convert the bytes string to the hex representation you want:

import binascii

with open("binaryfile.bin", 'rb') as f1:
    for line in f1.readlines():
        # NOTE: the 'sep' and 'bytes_per_sep' arguments are only available since Python 3.8
        print(binascii.hexlify(line, sep=' ', bytes_per_sep=2))
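If the goal is output shaped like `od -w8 -Ad -x` but with every row printed, a rough sketch along the following lines may help. The helper name `dump_words` and its parameters are hypothetical, not part of any library. It assumes the od output in the question came from a little-endian machine, so each pair of bytes is shown as one 16-bit word with the bytes swapped. Note also that readlines() splits binary data wherever a 0x0a byte happens to occur, so this sketch works on fixed-size slices instead.

```python
def dump_words(data, words_per_line=4, byteorder="little"):
    """Return od-style lines for `data`, one per row, never collapsing repeats.

    Hypothetical helper: `byteorder="little"` is an assumption about the
    machine that produced the od output in the question.
    """
    width = 2 * words_per_line  # bytes per output row
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Interpret each pair of bytes as one 16-bit word.
        words = [
            int.from_bytes(chunk[i:i + 2], byteorder)
            for i in range(0, len(chunk), 2)
        ]
        # Decimal offset padded to 7 digits, then 4-digit hex words.
        lines.append("%07d %s" % (offset, " ".join("%04x" % w for w in words)))
    return lines


# In-memory sample (made up) so the sketch runs without the real file.
sample = b"\x11\x00\x22\x00\x66\x00\x66\x00" + b"\x66\x00" * 8
for row in dump_words(sample):
    print(row)
# 0000000 0011 0022 0066 0066
# 0000008 0066 0066 0066 0066
# 0000016 0066 0066 0066 0066
```

For the real file, read everything at once with `data = f1.read()` inside the `with open("binaryfile.bin", 'rb') as f1:` block and pass `data` to the helper; `words_per_line=1` would correspond to the `-w2` layout.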

Upvotes: 3
