Darrin Woolit

Reputation: 69

PE 101: explanation of the addresses used for Windows API calls

I am trying to build a program that gives more information about a file, and possibly a disassembler. I looked at https://code.google.com/p/corkami/wiki/PE101 to learn more, and after reading it a few times I understand most of it. The part I don't understand is the addresses used for calls to the Windows API. For example, how did he know that the instruction call [0x402070] was an API call to MessageBoxA? I understand how to count the addresses of the strings, and the two push instructions that reference the strings make sense, but not the DLL part.

I guess what I am trying to say is that I don't understand the part labeled "imports structures" (the part I drew a yellow box around). If anyone could explain how 0x402068 points to ExitProcess and 0x402070 points to MessageBoxA, it would really help me. Thanks.

Upvotes: 2

Views: 809

Answers (1)

MKaama

Reputation: 1919

The loader (a part of the Windows OS) "patches up" the Import Address Table (IAT) before starting the sample program. That is when the real addresses of the library procedures appear in memory locations 0x402068 and 0x402070. Note that the imports reside in section nobits in simple.asm:

section nobits vstart=IMAGEBASE + 2 * SECTIONALIGN align=FILEALIGN

After loading, the section with the imports starts at virtual address IMAGEBASE (400000h) + 2 * SECTIONALIGN (1000h) = 0x402000.
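The arithmetic can be checked with a few lines of Python. This is only a sketch: the constant names mirror the ones quoted from simple.asm, and the values are the ones given above.

```python
# Values taken from the answer; the names mirror the constants in simple.asm.
IMAGEBASE = 0x400000
SECTIONALIGN = 0x1000

# The imports section occupies the third section-aligned slot (index 2).
imports_va = IMAGEBASE + 2 * SECTIONALIGN
print(hex(imports_va))

# The two call targets are fixed offsets inside that section.
exit_process_slot = 0x402068
message_box_slot = 0x402070
print(hex(exit_process_slot - imports_va), hex(message_box_slot - imports_va))
```

So call [0x402068] and call [0x402070] both read a pointer stored 0x68 and 0x70 bytes into the imports section.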

The yasm source of the example is quite unusual, and the diagram is not the best place to learn the PE format from. Please start by reading Wikipedia:Portable_Executable (a short article). It links to the full documents, so I will only make some short notes here.

You might also want to use Cheat Engine to inspect the sample. Launch simple.exe, attach to the process with Cheat Engine, press Memory View, then choose menu Tools->Dissect PE headers, press the Info button, and look at the Imports tab. In the memory dump, go to address 00402000 (CTRL+G, 00402000, Enter):

00402068: E4 39 BE 75 00 00 00 00 69 5F 47 77 00 00 00 00 6B 65 72 6E 65 6C 33 32 2E

Note the values at these locations:

  • 00402068: 0x75BE39E4 (on my computer) = the address of KERNEL32.ExitProcess
  • 00402070: 0x77475F69 (in my case only) = the address of user32.MessageBoxA
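The dump stores each address as four bytes in little-endian order, which is why E4 39 BE 75 reads as 0x75BE39E4. A minimal Python sketch of that decoding, using only the bytes shown in the dump above (the variable names are illustrative):

```python
import struct

# First 16 bytes of the dump at 00402068, copied from the answer.
dump = bytes.fromhex("E4 39 BE 75 00 00 00 00 69 5F 47 77 00 00 00 00")

# "<I" means one little-endian unsigned 32-bit integer.
exit_process_addr, = struct.unpack_from("<I", dump, 0x00)  # slot 0x402068
message_box_addr, = struct.unpack_from("<I", dump, 0x08)   # slot 0x402070

print(hex(exit_process_addr))
print(hex(message_box_addr))
```

The concrete addresses will differ on another machine, since they depend on where the DLLs were loaded.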

Notice the text "kernel32.dll user32.dll" right after them. Now look at a hexdump of simple.exe (I would use Far Manager) and find the same location just before the strings "kernel32.dll user32.dll". The values there are:

0000000450: 69 74 50 72 6F 63 65 73 │ 73 00 00 00 4D 65 73 73  itProcess   Mess
0000000460: 61 67 65 42 6F 78 41 00 │ 4C_20_00_00 00 00 00 00  ageBoxA L
0000000470: 5A_20_00_00 00 00 00 00 │ 6B 65 72 6E 65 6C 33 32  Z       kernel32
0000000480: 2E 64 6C 6C 00 75 73 65 │ 72 33 32 2E 64 6C 6C 00  .dll user32.dll
  • 0000000468: 0x0000204C, the Relative Virtual Address of the hint/name entry dw 0; db 'ExitProcess', 0
  • 0000000470: 0x0000205A, the Relative Virtual Address of the hint/name entry dw 0; db 'MessageBoxA', 0
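Following one of these values by hand can be sketched in Python. The bytes are copied from the hexdump above. The mapping from RVA to file offset is an assumption inferred from the dumps (virtual address 0x402068 corresponds to file offset 0x468, so RVA 0x2000 is taken to sit at file offset 0x400), and the helper names are hypothetical.

```python
import struct

# Bytes 0x450..0x48F of simple.exe, copied from the hexdump in the answer.
FILE_BASE = 0x450
data = bytes.fromhex(
    "69 74 50 72 6F 63 65 73 73 00 00 00 4D 65 73 73 "
    "61 67 65 42 6F 78 41 00 4C 20 00 00 00 00 00 00 "
    "5A 20 00 00 00 00 00 00 6B 65 72 6E 65 6C 33 32 "
    "2E 64 6C 6C 00 75 73 65 72 33 32 2E 64 6C 6C 00 "
)

SECTION_RVA = 0x2000  # 2 * SECTIONALIGN, from the answer
SECTION_RAW = 0x400   # assumption: inferred from 0x402068 <-> file offset 0x468


def rva_to_offset(rva):
    """Translate an RVA inside the imports section to a file offset."""
    return rva - SECTION_RVA + SECTION_RAW


def read_import_name(iat_file_offset):
    """Follow one on-disk IAT entry to the imported function's name."""
    name_rva, = struct.unpack_from("<I", data, iat_file_offset - FILE_BASE)
    start = rva_to_offset(name_rva) + 2 - FILE_BASE  # skip the 2-byte hint
    if start < 0:
        raise ValueError("name starts before the copied excerpt")
    end = data.index(b"\x00", start)
    return data[start:end].decode("ascii")


print(read_import_name(0x470))
```

The entry at file offset 0x470 resolves to MessageBoxA. The entry at 0x468 points to RVA 0x204C, whose name begins two bytes before the copied excerpt, so the sketch raises an error for it rather than guessing.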

When the file is loaded into memory, the loader replaces these values with the real addresses of the functions. The Microsoft document pecoff.doc says:

6.4.4. Import Address Table

The structure and content of the Import Address Table are identical to that of the Import Lookup Table, until the file is bound. During binding, the entries in the Import Address Table are overwritten with the 32-bit (or 64-bit for PE32+) addresses of the symbols being imported: these addresses are the actual memory addresses of the symbols themselves (although technically, they are still called “virtual addresses”). The processing of binding is typically performed by the loader.
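The overwrite step can be modelled as a toy in Python. This is a simplified sketch, not how the Windows loader is implemented: the EXPORTS dictionary is a hypothetical stand-in for the loader's export lookup, the addresses are the ones observed above, and names are stored directly in the slots where the real file holds RVAs of hint/name entries.

```python
# Hypothetical stand-in for the export lookup the loader performs.
# The addresses are the ones observed in the answer and differ per machine.
EXPORTS = {
    ("kernel32.dll", "ExitProcess"): 0x75BE39E4,
    ("user32.dll", "MessageBoxA"): 0x77475F69,
}

# Simplified IAT as stored in the file: slot address -> (DLL, imported name).
iat_on_disk = {
    0x402068: ("kernel32.dll", "ExitProcess"),
    0x402070: ("user32.dll", "MessageBoxA"),
}


def bind_imports(iat, exports):
    """Return the IAT as it looks in memory: each slot holds a real address."""
    return {slot: exports[symbol] for slot, symbol in iat.items()}


iat_in_memory = bind_imports(iat_on_disk, EXPORTS)

# call [0x402070] reads the slot and jumps to whatever address it holds.
print(hex(iat_in_memory[0x402070]))
```

This is why call [0x402070] can be identified as MessageBoxA: the slot at 0x402070 is the import entry whose name is MessageBoxA, and the instruction calls through that slot.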

Upvotes: 2
