Abhisheyk Deb

Reputation: 35

How can I convert assembly code into binary code?

I created a simple C++ source file with the following code:

    int main() {
        int a = 1;
        int b = 2;
        if (a < b) {
            return 1;
        }
        else if (a > b) {
            return 2;
        }
        else {
            return 3;
        }
    }

I used the objdump command to get the assembly code for the above source code.
The line int b = 2; got converted into mov DWORD PTR [rbp-0x4], 0x2.
Its corresponding machine code is C7 45 FC 02 00 00 00 (hex format).

I would like to know how I can convert assembly code into binary code myself. I went through the Intel Reference Manual for x86-64, but I was not able to understand it, since I am new to low-level programming.

Upvotes: 1

Views: 5177

Answers (1)

fuz

Reputation: 93127

You should read the Intel manuals; they explain how to do that. For a simpler reference, read this. The way x86 instructions are encoded is fairly straightforward, but the number of possibilities can be a bit overwhelming.

In a nutshell, an x86 instruction comprises the following parts, where every part except the opcode may be missing:

prefix opcode operands immediate

Prefixes can modify the behaviour of an instruction, but none is needed in your case. You can look up the opcode in a reference (I like this one). For example, mov r/m32, imm32 is listed as C7 /0, which means: the opcode is C7, and the register field of the ModR/M byte is 0. That field would normally name the second operand, but here it encodes an opcode extension instead. The instruction thus has the form

C7 /0 imm32

The operand/extended opcode is encoded as a ModR/M byte, with an optional SIB (scale index base) byte for some addressing modes and an optional 8 bit or 32 bit displacement. You can look up what value you need in the reference. In your case, you want to encode a memory operand [rbp] with a one byte displacement (mod = 01, r/m = 101) and a register field of 0 (reg = 000). Packed as mod, reg, r/m, that is 01 000 101 in binary, which gives the ModR/M byte 45. So the encoding is:

C7 45 disp8 imm32
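
The value 45 can be checked by packing the three ModR/M fields by hand. The following is only a minimal sketch; the helper name modrm_byte is made up for illustration and is not part of any library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: pack the three ModR/M fields into one byte.
// Layout: mod in bits 7-6, reg (or opcode extension) in bits 5-3, r/m in bits 2-0.
std::uint8_t modrm_byte(unsigned mod, unsigned reg, unsigned rm) {
    assert(mod < 4 && reg < 8 && rm < 8);  // each field must fit its bit width
    return static_cast<std::uint8_t>((mod << 6) | (reg << 3) | rm);
}
```

Calling modrm_byte(1, 0, 5) (mod = 01 for a one byte displacement, reg = 0 for the /0 extension, r/m = 101 for rbp) yields 0x45.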

Now we encode the 8 bit displacement in two's complement. -4 corresponds to FC (256 - 4 = 252 = 0xFC), so this is

C7 45 FC imm32
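
The two's complement step can be reproduced with a plain integer conversion. This is a small sketch under the assumption that the displacement fits in one signed byte; disp8_byte is a hypothetical name chosen for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: encode a signed displacement as one two's-complement byte.
// Converting a negative int to an unsigned 8-bit type wraps modulo 256,
// which is exactly the two's-complement representation.
std::uint8_t disp8_byte(int displacement) {
    assert(displacement >= -128 && displacement <= 127);  // must fit in disp8
    return static_cast<std::uint8_t>(displacement);
}
```

With this helper, disp8_byte(-4) gives 0xFC.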

Lastly, we encode the 32 bit immediate, which you want to be 2. Note that it is stored in little endian, i.e. least significant byte first:

C7 45 FC 02 00 00 00

And that's how the instruction is encoded.
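
Putting the steps together, here is a hedged sketch of an encoder for this one instruction form only (mov DWORD PTR [rbp+disp8], imm32). The function name encode_mov_rbp_disp8_imm32 is an assumption made for illustration; a real assembler handles far more cases.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode `mov DWORD PTR [rbp+disp], imm` as C7 /0 disp8 imm32.
std::vector<std::uint8_t> encode_mov_rbp_disp8_imm32(int disp, std::uint32_t imm) {
    assert(disp >= -128 && disp <= 127);  // only the disp8 form is handled here

    std::vector<std::uint8_t> bytes;
    bytes.push_back(0xC7);  // opcode for mov r/m32, imm32
    // ModR/M: mod = 01 (disp8), reg = 000 (/0 extension), r/m = 101 (rbp)
    bytes.push_back(static_cast<std::uint8_t>((1u << 6) | (0u << 3) | 5u));
    // Displacement as a two's-complement byte
    bytes.push_back(static_cast<std::uint8_t>(disp));
    // Immediate, least significant byte first (little endian)
    for (int shift = 0; shift < 32; shift += 8) {
        bytes.push_back(static_cast<std::uint8_t>(imm >> shift));
    }
    return bytes;
}
```

For disp = -4 and imm = 2 the returned bytes are C7 45 FC 02 00 00 00, matching the objdump output in the question.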

Upvotes: 6
