Reputation: 3335
I'm trying to write a script to see if a given file has a Java classfile header, i.e. the first 4 bytes of the file are 0xCAFEBABE.
However I'm not quite sure how to perform the equality checks.
Here's my current scratch code:
import os
import struct
import sys

class JavaClassParser(object):
    def __init__(self, filename):
        self.filename = filename
        if not os.path.isfile(self.filename):
            print "Please supply a valid source path"
            sys.exit(1)
        with open(self.filename, 'rb') as f:
            self.data = f.read()
        self.verify_header()

    def verify_header(self):
        """ Verifies 0xCAFEBABE header present
        (Java class file header) """
        header = struct.unpack("cccc", self.data[:4])
        if header != 0xCAFEBABE:
            print "File", self.filename, "does not appear to be a valid" +\
                " Java classfile. Header was", repr(header), "expected", repr(0xCAFEBABE)
            sys.exit(1)
When I feed it a valid Java classfile, I receive:
File myclass.class does not appear to be a valid Java classfile. Header was ('\xca', '\xfe', '\xba', '\xbe') expected 3405691582
So 0xCAFEBABE is being interpreted as an int by Python, while the unpacked header is a tuple of four one-character strings. I feel like I have a critical misunderstanding of something here.
I could rewrite 0xCAFEBABE as "\xca\xfe\xba\xbe" and remove the unpack call, but I find that syntax ugly. Is there a way I could get this working with the 0xCAFEBABE literal?
Upvotes: 0
Views: 274
Reputation: 59303
Try a different argument to unpack:
>>> header = "\xca\xfe\xba\xbe"
>>> struct.unpack(">L", header)
(3405691582,)
>>> struct.unpack(">L", header)[0] == 0xcafebabe
True
According to the docs, L stands for "unsigned long" (4 bytes in standard size, which the > prefix selects), and > stands for big-endian (which is the byte order of these bytes).
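Here is a minimal sketch of how that check might be wrapped up so the 0xCAFEBABE literal stays in the code. The names has_class_magic and JAVA_MAGIC are made up for illustration, and the length guard is my own addition, since unpack raises struct.error when given fewer than 4 bytes.

```python
import struct

JAVA_MAGIC = 0xCAFEBABE  # hypothetical constant name for the class file magic number


def has_class_magic(data):
    """Return True if data starts with the 4 bytes CA FE BA BE."""
    if len(data) < 4:
        # Too short to hold the header; unpack would raise struct.error.
        return False
    # ">L" reads one big-endian unsigned 32-bit integer.
    (magic,) = struct.unpack(">L", data[:4])
    return magic == JAVA_MAGIC
```

In verify_header this would presumably replace the tuple comparison with something like "if not has_class_magic(self.data):".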
Upvotes: 4
Reputation: 114038
How about just
self.data[:4].encode("hex") == "cafebabe"
or
self.data[:4] == "CAFEBABE".decode("hex")
(Note: I think this works only in Python 2.)
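If the "hex" codec is not available, the same two comparisons can probably be done with the standard binascii module, which as far as I know behaves the same way on Python 2 and 3. This is only a sketch; header_hex and magic_bytes are hypothetical helper names.

```python
import binascii


def header_hex(data):
    """Return the first 4 bytes of data as lowercase hex digits."""
    return binascii.hexlify(data[:4])


def magic_bytes():
    """Return the raw 4-byte magic number decoded from its hex spelling."""
    return binascii.unhexlify(b"cafebabe")
```

The check then becomes either header_hex(self.data) == b"cafebabe" or self.data[:4] == magic_bytes().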
Upvotes: 1