Reputation: 21

Assembly Determining the size of the code segment

Given a portion of the assembly listing that contains a code segment, how would one go about determining the size of the code segment?

Upvotes: 2

Answers (3)

Olof Forshell

Reputation: 3284

If your listing is the assembler output from a compiler, one of the left-most columns gives the relative starting address of every instruction, followed by the instruction's hex encoding and its mnemonic. Interspersed between the rows of assembly instructions you will find the source line that generated the instructions following it (unless you compiled with high optimization, in which case the correspondence is difficult to follow).

Some listings give addresses relative to the start of the module and others relative to the start of each function. Either way, the size is the address of the last instruction minus the address of the first, plus the number of bytes in the hex encoding of the last instruction.
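
As a minimal sketch of that subtraction, assuming the listing has already been reduced to (address, hex bytes) pairs: the rows below are hypothetical, hand-written values for a tiny 32-bit function that starts at module-relative offset 0x40, and `code_size` is a name made up for this illustration.

```python
# Hypothetical listing rows: (address of the instruction, its encoding as hex text).
# These values are illustrative and not taken from any real compiler output.
listing = [
    (0x40, "55"),      # push %ebp
    (0x41, "89 e5"),   # mov  %esp,%ebp
    (0x43, "5d"),      # pop  %ebp
    (0x44, "c3"),      # ret
]


def code_size(rows):
    """Last address minus first address, plus the byte length of the last instruction."""
    first_address, _ = rows[0]
    last_address, last_hex = rows[-1]
    last_length = len(bytes.fromhex(last_hex))
    return last_address - first_address + last_length


print(code_size(listing))
```

Because only the difference between the two addresses matters, the same function works whether the listing counts from the start of the module or from the start of the function.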

Upvotes: 0

FrankH.

Reputation: 18247

Predicting this before the assembly stage is, at least for x86 / x64 assembly, unfortunately impossible in the general case because the instruction set contains ambiguities: there are multiple possible machine encodings (with different sizes) for the same assembly instruction. Only the assembler itself knows which encoding it will finally choose.
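
To make the ambiguity concrete, here is a small sketch that lists a few valid encodings for two instructions and compares their lengths. The byte sequences follow the opcode tables in the Intel manuals; the dictionary layout and names are just for illustration, and which form a given assembler actually emits is not something this sketch can tell you.

```python
# Several valid x86 encodings for the same source instruction (32-bit operand size).
encodings = {
    "add $1, %eax": {
        "83 /0 ib (sign-extended imm8)": "83 c0 01",
        "05 id (EAX short form, imm32)": "05 01 00 00 00",
        "81 /0 id (generic imm32)": "81 c0 01 00 00 00",
    },
    "int $3": {
        "CC (dedicated breakpoint opcode)": "cc",
        "CD ib (generic INT imm8)": "cd 03",
    },
}

# Map each instruction to the sorted list of sizes its encodings can have.
sizes = {
    instruction: sorted(len(bytes.fromhex(hex_text)) for hex_text in forms.values())
    for instruction, forms in encodings.items()
}

for instruction, lengths in sizes.items():
    print(instruction, "->", lengths, "bytes")
```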

That said, of course it's normal and desirable to find the size of a piece of code; most assemblers simply do this by letting you take the difference between two labels within the code, like (GNU assembler, i.e. AT&T / UN*X style):

somefunc:
    pushq   %rbp
    movq    %rsp, %rbp
    movl    $(.Lfuncend - somefunc), %eax
    leave
    ret
.Lfuncend:

When you run this through the assembler and disassemble the output again, you see it inlines the $(.Lfuncend - somefunc) bit as a constant generated by the assembler:

$ objdump -d tst.o

tst.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <somefunc>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   b8 0b 00 00 00          mov    $0xb,%eax
   9:   c9                      leaveq 
   a:   c3                      retq

This function returns its own size, and as you can see from the offsets / binary opcodes shown, 0xb / 11 is correct.
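
If you would rather check that arithmetic mechanically, the following sketch parses the instruction rows of the disassembly above, sums the encoded bytes, and decodes the immediate from the `b8` (mov imm32 to %eax) row. The regular expression and the helper name `parse_rows` are my own assumptions about the text layout, not something objdump provides.

```python
import re

# Instruction rows copied from the objdump output above.
DISASSEMBLY = """\
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   b8 0b 00 00 00          mov    $0xb,%eax
   9:   c9                      leaveq
   a:   c3                      retq
"""

# An offset, a colon, then one or more two-digit hex bytes, each followed by whitespace.
ROW = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2}\s)+)")


def parse_rows(text):
    """Return (offset, encoded bytes) for every line that looks like an instruction row."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((int(match.group(1), 16), bytes.fromhex(match.group(2))))
    return rows


rows = parse_rows(DISASSEMBLY)
total = sum(len(code) for _, code in rows)

# The row starting with opcode b8 carries a 4-byte little-endian immediate.
mov_bytes = next(code for _, code in rows if code[0] == 0xB8)
embedded = int.from_bytes(mov_bytes[1:5], "little")

print(total, embedded)
```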

Upvotes: 1

Carl Norum

Reputation: 225272

If you want to figure this out manually, you can go grab the Intel® 64 and IA-32 Architectures Software Developer's Manuals, go through your source snippet mnemonic by mnemonic, and figure out the expected size of the assembled code. In the case of the Intel architectures, you might end up with slightly different answers than your assembler gives you, since some instructions have more than one valid encoding and the assembler picks one for you; int $3 comes to mind.

The better (and probably more accurate) way is to just assemble your snippet and check how big the resulting section is in the output file.
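
One way to do that check, sketched under some assumptions: assemble the snippet, run `objdump -h` on the object file, and read the Size column of the `.text` row. The helper below only parses text that you pass in; `section_size` is a hypothetical name, and `SAMPLE` is a hand-written illustration in the column layout binutils prints, not output captured from a real file.

```python
def section_size(objdump_h_text, name=".text"):
    """Return the Size column (hex) of the named section, or None if it is absent."""
    for line in objdump_h_text.splitlines():
        fields = line.split()
        # Section rows look like: index, name, size, VMA, LMA, file offset, alignment.
        if len(fields) >= 3 and fields[1] == name:
            return int(fields[2], 16)
    return None


# Hand-written sample in the usual `objdump -h` layout (illustrative only).
SAMPLE = """\
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         0000000b  0000000000000000  0000000000000000  00000040  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
"""

print(section_size(SAMPLE))
```

Keep in mind that the section size covers everything placed in `.text` in that object file, including any padding, so it matches a single function only when the snippet is assembled on its own.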

Upvotes: 1
