Frerich Raabe

Reputation: 94329

How can I simplify code generation at runtime?

I'm working on a piece of software which generates assembler code at runtime. For instance, here's a very simple function which generates assembler code for calling the GetCurrentProcess function (for the Win64 ABI):

#include <string.h>   // memcpy
#include <windows.h>  // FARPROC

void genGetCurrentProcess( char *codePtr, FARPROC addressForGetCurrentProcessFunction )
{
#ifdef _WIN64
  // mov rax, addressForGetCurrentProcessFunction
  *codePtr++ = 0x48;
  *codePtr++ = 0xB8;
  // memcpy instead of incrementing a cast expression, which only MSVC accepts
  memcpy( codePtr, &addressForGetCurrentProcessFunction, sizeof( FARPROC ) );
  codePtr += sizeof( FARPROC );

  // call rax
  *codePtr++ = 0xFF;
  *codePtr++ = 0xD0;
#else
  // mov eax, addressForGetCurrentProcessFunction
  *codePtr++ = 0xB8;
  memcpy( codePtr, &addressForGetCurrentProcessFunction, sizeof( FARPROC ) );
  codePtr += sizeof( FARPROC );

  // call eax
  *codePtr++ = 0xFF;
  *codePtr++ = 0xD0;
#endif
}

Usually I'd use inline assembler, but alas, this doesn't seem to be possible with the 64-bit MSVC compilers anymore. While I'm at it: this code should work with MSVC6 up to MSVC10 and also MinGW. There are many more functions like genGetCurrentProcess; they all emit assembler code, and many of them receive the function pointers to be called as arguments.

The annoying thing about this is that modifying this code is error-prone and we've got to take care of ABI-specific things manually (for instance, reserving 32 bytes stack space before calling functions for register spilling).

So my question is: can I simplify this code for generating assembler code at runtime? My hope was that I could somehow write the assembler code directly (possibly in an external file which is then assembled using ml/ml64), but it's not clear to me how this would work if some of the bytes in the assembled code are only known at runtime (the addressForGetCurrentProcessFunction value in the above example, for instance). Maybe it's possible to assemble some code but assign 'labels' to certain locations in the code so that I can easily modify the code at runtime and then copy it into my buffer?
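To make the 'labels' idea concrete, here is a rough sketch of what patching a pre-assembled template could look like. Everything in it is an assumption for illustration: the byte array stands in for the output of an external assembler, and kCallTemplate, kAddressOffset and emitCallTemplate are made-up names. The dummy immediate is deliberately large, since an assembler may pick a shorter encoding for a small constant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for bytes produced by an external assembler for:
//   mov rax, 0x1122334455667788   ; dummy value, patched at runtime
//   call rax
static const unsigned char kCallTemplate[] = {
    0x48, 0xB8, 0x88, 0x77, 0x66, 0x55, 0x44, 0x33, 0x22, 0x11,
    0xFF, 0xD0
};

// Offset of the dummy immediate inside the template. In a real build this
// would come from a label exported by the assembler (hypothetical here).
static const std::size_t kAddressOffset = 2;

// Copies the template into dst and overwrites the dummy immediate with
// target, least significant byte first. Returns the number of bytes written.
std::size_t emitCallTemplate(unsigned char* dst, std::uint64_t target) {
    std::memcpy(dst, kCallTemplate, sizeof(kCallTemplate));
    for (int i = 0; i < 8; ++i) {
        dst[kAddressOffset + i] = static_cast<unsigned char>(target >> (8 * i));
    }
    return sizeof(kCallTemplate);
}
```

The caller would pass a pointer into its code buffer plus the function address converted to an integer, and advance the buffer pointer by the returned size.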

Upvotes: 16

Views: 3966

Answers (4)

Ira Baxter

Reputation: 95354

The obvious thing to do is build a set of abstractions that represent the generation of the elements of the machine instructions of interest, and then compose calls to get the instructions/addressing modes you want. If you generate a wide variety of code, you can end up encoding the whole instruction set this way.

Then to generate a MOV instruction, you can write code that looks like:

ObjectCodeEmitMovRegister32ScaledRegister32OffsetRegister32(EAX,EDX,4,-LowerBound*4,ESP);

You can tell I like long names. (At least I never forget what they do.)

Here are some bits of a code generator supporting this that I implemented in C a long time ago. This covers kind of the hardest part, which is the generation of the ModRM and SIB bytes. Following this style one can implement as much of the instruction set as one likes. This example is only for 32-bit x86, so OP will have to extend and modify accordingly. The definition of the MOV instruction generator is down at the end.

#define Register32T enum Register32Type
enum Register32Type {EAX=0,ECX=1,EDX=2,EBX=3,ESP=4,EBP=5,ESI=6,EDI=7};

inline
byte ObjectCodeEmitModRM32Register32(Register32T Register32,Register32T BaseRegister32)
// Send ModRM32Bytes for register-register mode to object file
{  byte ModRM32Byte=0xC0+Register32*0x8+BaseRegister32;
   ObjectCodeEmitByte(ModRM32Byte);
   return ModRM32Byte;
}

inline
byte ObjectCodeEmitModRM32Direct(Register32T Register32)
// Send ModRM32Bytes for direct address mode to object file
{  byte ModRM32Byte=Register32*0x8+0x05;
   ObjectCodeEmitByte(ModRM32Byte);
   return ModRM32Byte;
}

inline
void ObjectCodeEmitSIB(Register32T ScaledRegister32,
           natural Scale,
           Register32T BaseRegister32)
// send SIB byte to object file
// Note: Use ESP for ScaledRegister32 to disable scaling; only useful when using ESP for BASE.
{  if (ScaledRegister32==ESP && BaseRegister32!=ESP) CompilerFault(31);
   if      (Scale==1) ObjectCodeEmitByte((byte)(0x00+ScaledRegister32*0x8+BaseRegister32));
   else if (Scale==2) ObjectCodeEmitByte((byte)(0x40+ScaledRegister32*0x8+BaseRegister32));
   else if (Scale==4) ObjectCodeEmitByte((byte)(0x80+ScaledRegister32*0x8+BaseRegister32));
   else if (Scale==8) ObjectCodeEmitByte((byte)(0xC0+ScaledRegister32*0x8+BaseRegister32));
   else CompilerFault(32);
} 

inline
byte ObjectCodeEmitModRM32OffsetRegister32(Register32T Register32,
                       integer Offset,
                       Register32T BaseRegister32)
// Send ModRM32Bytes for indexed address mode to object file
// Returns 1st byte of ModRM32 for possible use in EmittedPushRM32 peephole optimization
{  byte ModRM32Byte;
   if (Offset==0 && BaseRegister32!=EBP)
   {  ModRM32Byte=0x00+Register32*0x8+BaseRegister32;
      ObjectCodeEmitByte(ModRM32Byte);
      if (BaseRegister32==ESP) ObjectCodeEmitSIB(ESP,1,ESP);
   }
   else if (Offset>=-128 && Offset<=127)
   {  ModRM32Byte=0x40+Register32*0x8+BaseRegister32;
      ObjectCodeEmitByte(ModRM32Byte);
      if (BaseRegister32==ESP) ObjectCodeEmitSIB(ESP,1,ESP);
      ObjectCodeEmitByte((byte)Offset);
   }
   else
   {  // large offset
      ModRM32Byte=0x80+Register32*0x8+BaseRegister32;
      ObjectCodeEmitByte(ModRM32Byte);
      if (BaseRegister32==ESP) ObjectCodeEmitSIB(ESP,1,ESP);
      ObjectCodeEmitDword(Offset);
   }
   return ModRM32Byte;
}

inline
byte ObjectCodeEmitModRM32OffsetScaledRegister32(Register32T Register32,
                         integer Offset,
                         Register32T ScaledRegister32,
                         natural Scale)
// Send ModRM32Bytes for indexing by a scaled register with no base register to object file
// Returns 1st byte of ModRM32 for possible use in EmittedPushRM32 peephole optimization
{ byte ModRM32Byte=0x00+Register32*0x8+ESP;
  ObjectCodeEmitByte(ModRM32Byte); // MOD=00 --> SIB does disp32[index]
  ObjectCodeEmitSIB(ScaledRegister32,Scale,EBP);
  ObjectCodeEmitDword(Offset);
  return ModRM32Byte;
}

inline
byte ObjectCodeEmitModRM32ScaledRegister32OffsetRegister32(Register32T Register32,
                               Register32T ScaledRegister32,
                               natural Scale,
                               integer Offset,
                               Register32T BaseRegister32)
// Send ModRM32Bytes for indexed address mode to object file
// Returns 1st byte of ModRM32 for possible use in EmittedPushRM32 peephole optimization
// If Scale==0, leave scale and scaled register out of the computation
{  byte ModRM32Byte;
   if (Scale==0) // keep the returned byte so the result is never uninitialized
      ModRM32Byte=ObjectCodeEmitModRM32OffsetRegister32(Register32,Offset,BaseRegister32);
   else if (Offset==0 && BaseRegister32!=EBP)
   {  ModRM32Byte=0x00+Register32*0x8+ESP;
      ObjectCodeEmitByte(ModRM32Byte);
      ObjectCodeEmitSIB(ScaledRegister32,Scale,BaseRegister32);
   }
   else if (Offset>=-128 && Offset<=127)
   {  ModRM32Byte=0x40+Register32*0x8+ESP;
      ObjectCodeEmitByte(ModRM32Byte);
      ObjectCodeEmitSIB(ScaledRegister32,Scale,BaseRegister32);
      ObjectCodeEmitByte((byte)Offset);
   }
   else
   {  // large offset
      ModRM32Byte=0x80+Register32*0x8+ESP;
      ObjectCodeEmitByte(ModRM32Byte);
      ObjectCodeEmitSIB(ScaledRegister32,Scale,BaseRegister32);
      ObjectCodeEmitDword(Offset);
   }
   return ModRM32Byte;
}

inline
void ObjectCodeEmitLeaRegister32OffsetRegister32ScaledPlusBase32(
               Register32T Register32Destination,
               integer Offset,
               Register32T Register32Source,
               natural Scale, // 1,2,4 or 8
               Register32T Base)
// send "LEA Register32,offset[Register32*Scale+Base]" to object file
{ ObjectCodeEmitLeaOpcode();
  ObjectCodeEmitModRM32ScaledRegister32OffsetRegister32(
    Register32Destination,Register32Source,Scale,Offset,Base);
}

inline
void ObjectCodeEmitMovRegister32ScaledRegister32OffsetRegister32(Register32T DestinationRegister32,
                               Register32T ScaledRegister32,
                               natural Scale,
                               integer Offset,
                               Register32T BaseRegister32)
// Emit Mov R32 using scaled index addressing
{  ObjectCodeEmitMovRegister32Opcode();
   ObjectCodeEmitModRM32ScaledRegister32OffsetRegister32(DestinationRegister32,
                             ScaledRegister32,
                             Scale,
                             Offset,
                             BaseRegister32);
}

Upvotes: 0

Tamás Szelei

Reputation: 23921

Take a look at asmjit. It is a C++ library for runtime code generation. It supports x64 and probably most of the existing extensions (FPU, MMX, 3DNow!, SSE, SSE2, SSE3, SSE4). Its interface resembles assembly syntax, and it encodes the instructions correctly for you.

Upvotes: 11

snemarch

Reputation: 5008

You could depend on a real assembler to do the work for you; one that generates binary output is obviously the best. Consider looking at yasm or fasm (there are some posts on the fasm forums about doing a DLL version, so you don't have to write a temporary assembly file, launch an external process, and read the output file back, but I dunno if it's been updated for later versions).

This might be overkill if your needs are relatively simple, though. I'd consider doing a C++ Assembler class supporting just the mnemonics you need, along with some helper functions like GeneratePrologue, GenerateEpilogue, InstructionPointerRelativeAddress and such. This would allow you to write pseudo-assembly and have the helper functions take care of 32/64-bit issues.
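A minimal sketch of such a class might look like the following. It is not a finished design: the class layout, the method names (movRegImm, callReg, GenerateCall) and the restriction to four registers are all assumptions made for illustration. GenerateCall shows how a helper can hide the Win64 rule of reserving 32 bytes of shadow space, plus 8 bytes here to keep the stack 16-byte aligned, assuming it runs right at the entry of the generated function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal emitter: names and scope are illustrative only.
class Assembler {
public:
    enum Reg { AX = 0, CX = 1, DX = 2, BX = 3 };  // eax..ebx or rax..rbx

    explicit Assembler(bool is64Bit) : is64_(is64Bit) {}

    // mov reg, imm: B8+r imm32 on x86, REX.W B8+r imm64 on x64
    void movRegImm(Reg r, std::uint64_t value) {
        if (is64_) emit8(0x48);
        emit8(static_cast<std::uint8_t>(0xB8 + r));
        emitLE(value, is64_ ? 8 : 4);
    }

    // call reg: FF /2, which gives ModRM = 0xD0 + reg
    void callReg(Reg r) {
        emit8(0xFF);
        emit8(static_cast<std::uint8_t>(0xD0 + r));
    }

    void ret() { emit8(0xC3); }

    // Calls an absolute address through rax/eax. On x64 it first reserves
    // 0x28 bytes (32 bytes shadow space + 8 for alignment) and releases
    // them afterwards; on x86 no stack adjustment is emitted.
    void GenerateCall(std::uint64_t target) {
        if (is64_) adjustSp(0xEC, 0x28);  // sub rsp, 0x28
        movRegImm(AX, target);
        callReg(AX);
        if (is64_) adjustSp(0xC4, 0x28);  // add rsp, 0x28
    }

    const std::vector<std::uint8_t>& code() const { return code_; }

private:
    void emit8(std::uint8_t b) { code_.push_back(b); }

    // Appends the low `bytes` bytes of v, least significant byte first.
    void emitLE(std::uint64_t v, int bytes) {
        for (int i = 0; i < bytes; ++i)
            emit8(static_cast<std::uint8_t>(v >> (8 * i)));
    }

    // REX.W 83 /r ib with rsp as operand; modrm selects sub (0xEC) or add (0xC4).
    void adjustSp(std::uint8_t modrm, std::uint8_t imm8) {
        emit8(0x48); emit8(0x83); emit8(modrm); emit8(imm8);
    }

    bool is64_;
    std::vector<std::uint8_t> code_;
};
```

With this, a function like genGetCurrentProcess would shrink to constructing an Assembler, calling GenerateCall with the function address, calling ret, and copying code() into executable memory.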

Upvotes: 2

Alexey Frunze

Reputation: 62048

You could abstract away some instruction encoding, calling convention and CPU-mode-related details by writing some helper functions and macros.

You can even create a small assembler that takes numerically encoded pseudo-asm code stored in an array and assembles it into runnable code, e.g. starting with input like this:

UINT32 blah[] =
{
  mov_, ebx_, dwordPtr_, edi_, plus_, eax_, times8_, plus_, const_, 0xFEDCBA98,
  call_, dwordPtr_, ebx_,
};

But it's a lot of work to get this done and done right. For something simpler, just create helper functions/macros, essentially doing what you have already done, but hiding some nasty details from the user.
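To give a feel for the array-driven approach, here is a deliberately tiny sketch that understands only three forms: mov_ reg, const_, imm32; call_ reg; and ret_. The token values, the ret_ token and the assemble function are assumptions made up for this example; memory operands such as the dwordPtr_ form above are exactly the part that makes a real version a lot of work, and they are not handled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical token values; only the trailing-underscore naming follows
// the example above.
enum Token : std::uint32_t {
    mov_ = 0x80000000u, call_, ret_, const_, eax_, ecx_, edx_, ebx_
};

// Maps a register token to its x86 register number, or -1 if unknown.
static int regNumber(std::uint32_t tok) {
    switch (tok) {
    case eax_: return 0;
    case ecx_: return 1;
    case edx_: return 2;
    case ebx_: return 3;
    default:   return -1;
    }
}

// Assembles the token stream into 32-bit x86 machine code appended to out.
// Returns false as soon as it meets input it does not understand.
bool assemble(const std::uint32_t* in, std::size_t n,
              std::vector<std::uint8_t>& out) {
    std::size_t i = 0;
    while (i < n) {
        switch (in[i]) {
        case mov_: {  // mov_, reg, const_, imm32  ->  B8+r imm32
            if (n - i < 4) return false;
            int r = regNumber(in[i + 1]);
            if (r < 0 || in[i + 2] != const_) return false;
            out.push_back(static_cast<std::uint8_t>(0xB8 + r));
            for (int b = 0; b < 4; ++b)  // immediate, low byte first
                out.push_back(static_cast<std::uint8_t>(in[i + 3] >> (8 * b)));
            i += 4;
            break;
        }
        case call_: {  // call_, reg  ->  FF D0+r
            if (n - i < 2) return false;
            int r = regNumber(in[i + 1]);
            if (r < 0) return false;
            out.push_back(0xFF);
            out.push_back(static_cast<std::uint8_t>(0xD0 + r));
            i += 2;
            break;
        }
        case ret_:  // ret  ->  C3
            out.push_back(0xC3);
            i += 1;
            break;
        default:
            return false;
        }
    }
    return true;
}
```

Because a constant is only ever read from the slot right after const_, its numeric value cannot be mistaken for a token even if the two happen to coincide.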

Upvotes: 0
