BPL

Reputation: 9863

How to read ctypes structures from a file

I'm trying to learn more about PE file structures and also about ctypes structures. I have a couple of questions:

Given this simple declaration:

from ctypes import *

class ImageDosHeader(Structure):
    _fields_ = [
        ('e_magic', c_uint8, 2),
        ('e_cblp', c_uint8, 2),
        ('e_cp', c_uint8, 2),
        ('e_crlc', c_uint8, 2),
        ('e_cparhdr', c_uint8, 2),
        ('e_minalloc', c_uint8, 2),
        ('e_maxalloc', c_uint8, 2),
        ('e_ss', c_uint8, 2),
        ('e_sp', c_uint8, 2),
        ('e_csum', c_uint8, 2),
        ('e_ip', c_uint8, 2),
        ('e_cs', c_uint8, 2),
        ('e_lfarlc', c_uint8, 2),
        ('e_ovno', c_uint8, 2),
        ('e_res', c_uint8, 2),
        ('e_oemid', c_uint8, 2),
        ('e_oeminfo', c_uint8, 2),
        ('e_res2', c_uint8, 2 * 10),
        ('e_lfanew', c_uint8, 4),
    ]

Question 1: Why do I get ValueError: number of bits invalid for bit field?

If I try to read a simplified structure like this:

import binascii
from ctypes import *

class ImageDosHeader(Structure):
    _fields_ = [
        ('e_magic', c_uint8, 2),
    ]


filename = 'example1_crinkler.exe'
with open(filename, 'rb') as f:
    record = ImageDosHeader()
    f.readinto(record)
    print(record.e_magic)

print('-' * 80)
with open(filename, 'rb') as f:
    content = f.read()
    print(binascii.hexlify(content))
print(len(content))

I'm getting this output:

1
--------------------------------------------------------------------------------
b'4d5a3230504500004c01000001db617f10d017737547ebf9080002000b0111c94585c0791f01d350f7e2903d5c000000f7f339c119dbeb480000400004000000040000000fa32d320140008d0400ebce00000000ebb64206400000005331edbb0300000090be1c0140006a0158bf00004200b1009057eb1200000000000000005a72079229d1040029d060ad01f8742c6a0a5a89145489542410ad31ed4d4501c072fb74af60ac88c232076bc06f028700000000484f00d272ef75f9bf16104200b923e91f03730cf366ab0a06618d760e7bb7c3f7f18d3c5789e931c0ae74040007750241410fb61407d3e201548434487af385db7f0dd02c1f7503d0141ff7d3fe041f6146eb95e86aa6384b0b237bc66f82e9197a11121b13094ff4efbdfffeff7d96008180c090caeaefbdffffffffff44bec48b813eee1868379da18dcb827772cd7a49fe835cc82cc3850cb06603a8fff8e2d9becd3629b3268d3d6cb5ce9fd714516822c6dcc349482dc1e6868f92ed3df97e79469ad174b79b0479d9afb758dd7dd85c9218edef5ca0b47d5dd8aa46494ef4242201baeb2853c31d6f5ac630668462c5543f23a23616db595d1cd08993fce7231e8716b5f79480a1cacd10498fe4864843e744ae3e124c00e74af0173c8346860d98fb9882688c896737e07d3f16ee0d08238b0f0ed89553dbe2b177b64249ca712b9051e88c73'
510

Question 2: Why does print(record.e_magic) print 1? I'd expect it to be "MZ", or 0x4d5a, or 19802... but not 1. What am I missing here?

PS: I'm aware of the pefile PyPI package and similar ones. That package is awesome, but I'd like to reinvent the wheel here using ctypes so I can learn about it too, just for the record :)

Upvotes: 0

Views: 2473

Answers (1)

Yann Vernier

Reputation: 15887

Both questions seem to indicate the same misunderstanding. Quoting from the ctypes documentation for Structure._fields_:

For integer type fields like c_int, a third optional item can be given. It must be a small positive integer defining the bit width of the field.

This means the third item is the number of bits, not the number of values. You have therefore declared e_magic as a 2-bit-wide field within a uint8. The ValueError comes from e_res2, which asks for 20 bits out of an 8-bit type. The value 1 is the lowest two bits of 'M': ord('M') & 0b11 == 1.
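Both effects can be reproduced without the executable. The following is a minimal sketch: TwoBitField and TooWide are hypothetical names used only for illustration, and the bit-field read assumes a little-endian host, where the first bit field occupies the lowest bits of the byte.

```python
import ctypes


class TwoBitField(ctypes.Structure):
    # The third tuple item is a width in BITS: a 2-bit field inside one uint8.
    _fields_ = [('e_magic', ctypes.c_uint8, 2)]


# Load the single byte 'M' (0x4d) into the structure.
record = TwoBitField.from_buffer_copy(b'M')
low_two_bits = ord('M') & 0b11
print(record.e_magic, low_two_bits)

# Asking for 20 bits out of an 8-bit type is rejected when the class is created.
error = None
try:
    class TooWide(ctypes.Structure):
        _fields_ = [('e_res2', ctypes.c_uint8, 2 * 10)]
except ValueError as exc:
    error = exc
    print('ValueError:', exc)
```

The exact wording of the error message may vary between Python versions, but the exception type is ValueError.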

You can fix the declaration by choosing types that match each field, such as c_uint16 for the 16-bit fields, or by building array types, such as 2*c_char. Avoid the third entry unless you're really dealing with a bit field; ctypes already knows the size of its types.
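As a rough sketch of that advice, here is the header declared with sized types and parsed from the first 64 bytes of the hex dump in the question. Some choices here are assumptions rather than part of the question: the layout follows IMAGE_DOS_HEADER from winnt.h (where e_res is a 4-word array and e_lfanew is a signed 32-bit LONG), LittleEndianStructure is used so the result does not depend on the host byte order, and MagicOnly is a hypothetical helper that only shows the c_char * 2 variant.

```python
import ctypes

WORD = ctypes.c_uint16


class ImageDosHeader(ctypes.LittleEndianStructure):
    # No third tuple item: each ctypes type already carries its own size.
    _fields_ = [
        ('e_magic', WORD), ('e_cblp', WORD), ('e_cp', WORD),
        ('e_crlc', WORD), ('e_cparhdr', WORD), ('e_minalloc', WORD),
        ('e_maxalloc', WORD), ('e_ss', WORD), ('e_sp', WORD),
        ('e_csum', WORD), ('e_ip', WORD), ('e_cs', WORD),
        ('e_lfarlc', WORD), ('e_ovno', WORD),
        ('e_res', WORD * 4),        # array of 4 words
        ('e_oemid', WORD), ('e_oeminfo', WORD),
        ('e_res2', WORD * 10),      # array of 10 words
        ('e_lfanew', ctypes.c_int32),
    ]


class MagicOnly(ctypes.Structure):
    # Alternative view of the first two bytes as raw characters.
    _fields_ = [('e_magic', ctypes.c_char * 2)]


# First 64 bytes of the hex dump printed in the question.
data = bytes.fromhex(
    '4d5a3230504500004c01000001db617f'
    '10d017737547ebf9080002000b0111c9'
    '4585c0791f01d350f7e2903d5c000000'
    'f7f339c119dbeb480000400004000000'
)

header = ImageDosHeader.from_buffer_copy(data)
magic = MagicOnly.from_buffer_copy(data[:2]).e_magic

print(ctypes.sizeof(ImageDosHeader))  # 64
print(hex(header.e_magic))            # 0x5a4d: bytes 4d 5a read little-endian
print(magic)                          # b'MZ'
print(header.e_lfanew)                # offset of the PE signature
```

Note that the 16-bit read yields 0x5a4d rather than 0x4d5a, because the two bytes are interpreted in little-endian order. With a real file, f.readinto(header) on a file opened in 'rb' mode fills the same structure, as in the question.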

Upvotes: 3
