Reputation: 11
What I'm trying to do is extract PNG images from files. Reading the hex data makes it easy to find where they are hidden, because PNG images always start and end with certain fixed values. I wrote a script that opens a .bin file, searches for those values, and exports what lies between them as a PNG. The problem is that in Python 2.7 nothing happens, and in Python 3 I get errors about the encoding of the file. I've tried ignoring errors and passing a utf-8 encoding flag, but the problems persist. The code in question:
import binascii
import re
import os

for directory, subdirectories, files in os.walk('.'):
    for file in files:
        if not file.endswith('.bin'):
            continue
        filenumber = 0
        with open(os.path.join(directory, file)) as f:
            hexaPattern = re.compile(
                r'(89504E47.*?AE426082)',
                re.IGNORECASE
            )
            for match in hexaPattern.findall(binascii.hexlify(f.read())):
                with open('{}-{}.png'.format(file, filenumber), 'wb+') as f:
                    f.write(binascii.unhexlify(match))
                filenumber += 1
So as you can see, the script should extract the hex values beginning with "89504E47" from the imported file, together with everything between that and "AE426082". I think the code for finding these values is fine, but I'm having trouble getting Python to actually read the file as hexadecimal. Thoughts?
Upvotes: 0
Views: 33
Reputation: 11
Thank you @Thierry Lathuille, that fixed it. I used Python 3.9 and changed the line that opens the file so it reads in binary mode:
with open(os.path.join(directory, file), 'rb+') as f:
and everything was output correctly!
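For anyone landing here later, below is a rough sketch of the same idea that skips the hex round trip and searches the raw bytes directly. It is not the exact script from the question. The names PNG_PATTERN, extract_pngs and extract_from_tree are made up for illustration, and the start and end markers are simply the same byte values as in the question. One thing worth knowing: in Python 3, binascii.hexlify returns bytes, so a regex used on its result has to be a bytes pattern as well. Matching on the raw bytes sidesteps that, and also keeps every match aligned to whole bytes.

```python
import re
from pathlib import Path

# Same markers as in the question, written as raw bytes:
# 89 50 4E 47 starts the PNG signature, AE 42 60 82 ends the IEND chunk.
# re.DOTALL lets "." match newline bytes too, which binary data will contain.
PNG_PATTERN = re.compile(
    rb"\x89\x50\x4e\x47.*?\xae\x42\x60\x82",
    re.DOTALL,
)


def extract_pngs(data):
    """Return every byte run in data that lies between the two markers."""
    return PNG_PATTERN.findall(data)


def extract_from_tree(root="."):
    """Scan root for .bin files and write each match as <name>-<n>.png."""
    for path in Path(root).rglob("*.bin"):
        data = path.read_bytes()  # binary read, so no decoding is attempted
        for number, png in enumerate(extract_pngs(data)):
            # Like the original script, output goes to the working directory.
            Path("{}-{}.png".format(path.name, number)).write_bytes(png)
```

Calling extract_from_tree('.') from the folder that holds the .bin files should behave like the original loop. The match is non-greedy, so if the end marker bytes happen to occur inside the image data, the extracted file will be cut short.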
Upvotes: 1