Steve H

Reputation: 493

Python read() API anomaly

Hi, here is a snippet from my Python script:

#seek to the symtab_offset    
elf_fp.seek(self.symtab_sh.sh_offset)

#each entry is 16 bytes, so num_entries = size/16
num_entries = self.symtab_sh.sh_size/16
symbol_list = []
counter = 0
prev=0
for _ in range(num_entries):
    counter+=1
    s = struct.Struct('IIIccH' )
    prev = elf_fp.tell()
    print str(counter) +"  " +str(elf_fp.tell()) +"/" + str(hex(elf_fp.tell())),
    buffer = elf_fp.read(16)
    print " diff: " +str(elf_fp.tell() - prev)
    if len(buffer) !=16:
        continue
    unpacked_data = s.unpack(buffer)
    name          = unpacked_data[0]
    value         = unpacked_data[1]
    size          = unpacked_data[2]
    types         = unpacked_data[3]
    #print str(size) +"," +str(types.encode('hex'))
    #only add non-zero size entries
    if size and name:
       symbol_list.append({"name":name,"value":value, "size": size, "type": types})

This snippet reads 16 bytes at a time from an ELF file's symbol table and unpacks each entry with a struct format. The problem is that in a large ELF file with more than 100 symbols, I can decode the symbol information for roughly the first 100 symbols but not for the last few.

Looking at my log, I can see that read() is behaving oddly. After reading 16 bytes from the file, the file pointer should advance by 16 bytes. Instead, at some places it advances by a much larger offset.

Here is a log snippet:

107  36056/0x8cd8L  diff: 16
108  36072/0x8ce8L  diff: 16
109  36088/0x8cf8L  diff: 16
110  36104/0x8d08L  diff: 2864
111  38968/0x9838L  diff: 16

You can see that for the 110th symbol the read causes a jump of 2864 bytes. Any idea why read() behaves this way? Are there known problems with Python's read() API?

Upvotes: 0

Views: 175

Answers (1)

Robᵩ

Reputation: 168616

You've opened the file in 'r' mode, which is text mode. For file.tell() to report useful offsets on a binary file such as an ELF image, you must open the file in 'rb' (binary) mode.
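
As a minimal sketch of that fix (not the asker's actual class), the loop could look like the following. The helper name read_symbols, its parameters, and the '<IIIBBH' layout (a little-endian 32-bit ELF symbol entry) are assumptions made for illustration; the question only shows 'IIIccH' and does not show the open() call. On platforms that distinguish text from binary files, text mode may translate line endings, which is the likely reason the reported offsets drift.

```python
import struct

# Assumed layout: little-endian 32-bit ELF symbol entry, 16 bytes in total
# (name index, value, size, info byte, other byte, section index).
SYM_ENTRY = struct.Struct('<IIIBBH')


def read_symbols(path, symtab_offset, symtab_size):
    """Hypothetical helper: return (file_offset, fields) per symbol entry."""
    entries = []
    # 'rb' turns off any newline translation, so each read(16)
    # advances tell() by exactly 16 bytes.
    with open(path, 'rb') as elf_fp:
        elf_fp.seek(symtab_offset)
        for _ in range(symtab_size // SYM_ENTRY.size):
            start = elf_fp.tell()
            buf = elf_fp.read(SYM_ENTRY.size)
            if len(buf) != SYM_ENTRY.size:
                break  # table is truncated; stop instead of skipping
            entries.append((start, SYM_ENTRY.unpack(buf)))
    return entries
```

Using integer division (//) for the entry count also keeps range() happy on Python 3, where / would produce a float.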

Upvotes: 3
