Reputation: 91
I was experimenting with the Capstone disassembler and ran into some strange behaviour.
I wrote a simple program that takes notepad.exe (x86-64 PE), disassembles its .text section, and prints the disassembly line by line. (It is a slightly modified version of https://stackoverflow.com/a/66140741.)
Problem: the disassembly appears to be interrupted immediately after 0x1c00, restarts from the beginning of the .text section, and ends 0x400 bytes before it should.
Note 1: Capstone version: 5.0.1.
Note 2: The file is not corrupted.
Note 3: pefile loads the file correctly.
Note 4: Same behaviour when running the example on Windows.
Note 5: Same behaviour on other files.
Is this a bug or am I doing something wrong?
Code:
import pefile
from capstone import *

exe_file = '/home/user/TEST/notepad.exe'
pe = pefile.PE(exe_file)

# find .text section
offset = False
for section in pe.sections:
    if section.Name == b'.text\x00\x00\x00':
        offset = section.VirtualAddress
        code_ptr = section.PointerToRawData
        code_end_ptr = code_ptr + section.SizeOfRawData
        print("@@@ offset=0x{:0x} code_ptr=0x{:0x} code_end_ptr=0x{:0x}".format(offset, code_ptr, code_end_ptr))
        break

code = pe.get_memory_mapped_image()[code_ptr : code_end_ptr]

# start disassembling text section
md = Cs(CS_ARCH_X86, CS_MODE_64)
md.detail = True
if offset:
    for i in md.disasm(code, offset):
        print(i)
print("end")
Output:
@@@ offset=0x1000 code_ptr=0x400 code_end_ptr=0x24a00
<CsInsn 0x1000 [cc]: int3 >
<CsInsn 0x1001 [cc]: int3 >
<CsInsn 0x1002 [cc]: int3 >
<CsInsn 0x1003 [cc]: int3 >
<CsInsn 0x1004 [cc]: int3 >
<CsInsn 0x1005 [cc]: int3 >
<CsInsn 0x1006 [cc]: int3 >
<CsInsn 0x1007 [cc]: int3 >
<CsInsn 0x1008 [4c8bdc]: mov r11, rsp>
<CsInsn 0x100b [4881ec88000000]: sub rsp, 0x88>
...
<CsInsn 0x1bf4 [e847fdffff]: call 0x1940>
<CsInsn 0x1bf9 [eb0c]: jmp 0x1c07>
<CsInsn 0x1bfb [4c8d05d659cccc]: lea r8, [rip - 0x3333a62a]> <-- disassembly interrupts immediately after 0x1c00,
<CsInsn 0x1c02 [cc]: int3 > <-- starts from the beginning of .text,
<CsInsn 0x1c03 [cc]: int3 >
<CsInsn 0x1c04 [cc]: int3 >
<CsInsn 0x1c05 [cc]: int3 >
<CsInsn 0x1c06 [cc]: int3 >
<CsInsn 0x1c07 [cc]: int3 >
<CsInsn 0x1c08 [4c8bdc]: mov r11, rsp>
<CsInsn 0x1c0b [4881ec88000000]: sub rsp, 0x88>
...
<CsInsn 0x255ee [cc]: int3 >
<CsInsn 0x255ef [cc]: int3 >
<CsInsn 0x255f0 [4883790800]: cmp qword ptr [rcx + 8], 0>
<CsInsn 0x255f5 [488d05d4290000]: lea rax, [rip + 0x29d4]> <-- and ends 0x400 bytes before it should
end
IDA Pro disassembly (for reference):
Upvotes: 3
Views: 125
Reputation: 21
"By default, Capstone stops disassembling when it encounters a broken instruction."
Try turning on SKIPDATA mode:
import pefile
from capstone import *

exe_file = '/home/user/TEST/notepad.exe'
pe = pefile.PE(exe_file)

# find .text section
offset = False
for section in pe.sections:
    if section.Name == b'.text\x00\x00\x00':
        offset = section.VirtualAddress
        code_ptr = section.PointerToRawData
        code_end_ptr = code_ptr + section.SizeOfRawData
        print("@@@ offset=0x{:0x} code_ptr=0x{:0x} code_end_ptr=0x{:0x}".format(offset, code_ptr, code_end_ptr))
        break

code = pe.get_memory_mapped_image()[code_ptr : code_end_ptr]

# start disassembling text section
md = Cs(CS_ARCH_X86, CS_MODE_64)
md.detail = True
md.skipdata = True  # turn on skipdata mode
if offset:
    for i in md.disasm(code, offset):
        print(i)
print("end")
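
Independent of SKIPDATA, it may be worth checking how the code buffer is sliced. The script indexes pe.get_memory_mapped_image() with PointerToRawData, which is a file offset, whereas a memory-mapped image is laid out by RVA (VirtualAddress). With the values printed in the question (code_ptr=0x400, offset=0x1000), the slice would start 0xc00 bytes before the mapped .text, and 0x1000 + 0xc00 = 0x1c00 is exactly where the listing restarts. The toy model below reproduces that effect without pefile or Capstone. The section size, the helper names, and the assumption that the bytes below the first section's RVA are a copy of the raw file bytes are all illustrative, not taken from pefile's documentation.

```python
# Toy model (hypothetical layout, not pefile itself): shows why slicing a
# memory-mapped image with file offsets gives a shifted, partly duplicated view.

RAW_PTR = 0x400    # PointerToRawData, as printed in the question
RVA = 0x1000       # VirtualAddress, as printed in the question
RAW_SIZE = 0x2000  # made-up section size for this toy


def build_file():
    """Fake file: RAW_PTR header bytes followed by the section's raw bytes."""
    header = bytes(RAW_PTR)
    section_bytes = bytes((i * 7 + 1) % 251 for i in range(RAW_SIZE))
    return header + section_bytes


def build_mapped_image(file_bytes):
    """Assumed mapping: raw file bytes up to RVA, then the section at its RVA."""
    section_bytes = file_bytes[RAW_PTR:RAW_PTR + RAW_SIZE]
    return file_bytes[:RVA] + section_bytes


file_bytes = build_file()
image = build_mapped_image(file_bytes)
section = file_bytes[RAW_PTR:RAW_PTR + RAW_SIZE]

# What the script does: file offsets applied to the mapped image.
wrong = image[RAW_PTR:RAW_PTR + RAW_SIZE]
# Consistent alternative: RVA-based indices on the mapped image.
right = image[RVA:RVA + RAW_SIZE]

shift = RVA - RAW_PTR
print(hex(shift))                                   # 0xc00
print(right == section)                             # True
print(wrong[:shift] == section[:shift])             # True: a prefix of .text ...
print(wrong[shift:] == section[:RAW_SIZE - shift])  # True: ... then .text again, cut short
```

In the real script, the consistent choices would be either section.get_data() or slicing the mapped image from section.VirtualAddress instead of section.PointerToRawData. This is untested against the notepad.exe from the question.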
Upvotes: 2