ssjblue

Reputation: 11

Read input file as hexadecimal and output certain values; always fails on reading the file

What I'm trying to do is extract PNG images from files. Reading the hex data makes it easy to find where they are hidden, because PNG images always start and end with certain values. I wrote a script that opens a .bin file, searches for those values, and exports each match as a PNG. The problem is that in Python 2.7 nothing happens, and in Python 3 I get errors about the encoding of the file. I've tried the ignore-errors and utf-8 encoding flags, but the problems persist. The code in question:

import binascii
import re
import os

for directory, subdirectories, files in os.walk('.'):
    for file in files:

        if not file.endswith('.bin'):
            continue

        filenumber = 0

        with open(os.path.join(directory, file)) as f:

            hexaPattern = re.compile(
                r'(89504E47.*?AE426082)',
                re.IGNORECASE
            )

            for match in hexaPattern.findall(binascii.hexlify(f.read())):

                with open('{}-{}.png'.format(file, filenumber), 'wb+') as f:
                    f.write(binascii.unhexlify(match))

                filenumber += 1

So as you can see, it extracts the hex values beginning with "89504E47" from the imported file, plus anything between that and "AE426082". I think the code for getting these values is fine, but I'm having trouble getting Python to actually read the file as hexadecimal. Thoughts?

Upvotes: 0

Views: 33

Answers (1)

ssjblue

Reputation: 11

Thank you @Thierry Lathuille, that fixed it. I switched to Python 3.9 and changed the open call to binary mode:

with open(os.path.join(directory, file), 'rb+') as f:

and everything output correctly!
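
For anyone landing here later, below is a minimal sketch of the same idea for Python 3, not the exact script I ran. It assumes the same start and end markers as in the question ("89504E47" and "AE426082"), and the names extract_pngs and extract_from_file are made up for illustration. It searches the raw bytes directly instead of a hexlified string. In Python 3 binascii.hexlify returns bytes, so a str pattern would not match it anyway, and searching raw bytes also avoids matches that start halfway through a byte.

```python
import re

# Start and end markers from the question, converted from hex to raw bytes.
START = bytes.fromhex('89504E47')
END = bytes.fromhex('AE426082')

# Bytes pattern; re.DOTALL lets '.' match newline bytes inside the image data.
PNG_PATTERN = re.compile(re.escape(START) + b'.*?' + re.escape(END), re.DOTALL)


def extract_pngs(data):
    """Return every non-overlapping START...END byte run found in data."""
    return PNG_PATTERN.findall(data)


def extract_from_file(path):
    """Read path in binary mode and write each match to '<path>-<n>.png'."""
    with open(path, 'rb') as source:
        data = source.read()
    count = 0
    for count, png in enumerate(extract_pngs(data), start=1):
        # Separate name for the output handle so the input handle is not shadowed.
        with open('{}-{}.png'.format(path, count - 1), 'wb') as out:
            out.write(png)
    return count
```

Calling extract_from_file on each .bin path from os.walk should give the same output file names as the original loop. Plain 'rb' is enough here, since the input file is only read and never written.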

Upvotes: 1
