Reputation: 79458
I understand the gist of how JIT compilation works (after reading such resources as this SO question). However, I am still wondering how it actually executes the machine code at runtime.
I don't have a deep background in operating systems or compiler optimizations, and haven't done anything with machine code directly, but am starting to explore it. I have started playing around in assembly, and see how something like NASM can take your assembly code and compile it to machine code (the executable), and then you can "invoke" it from the command line like ./my-executable.
But how is a JIT compiler actually doing that at runtime? Is it like streaming machine code into stdin or something, or how does it work? If you could provide an example or some pseudocode of how some assembly (or something along those lines, not as high level as C though) might look to demonstrate the basic flow, that would be amazing too.
Upvotes: 8
Views: 1956
Reputation: 186
I'll try to elaborate on @MooingDuck's answer. Let's take a C# hello-world example.
using System;

namespace Hello
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Hello, world!");
        }
    }
}
A roughly equivalent program in x86 assembly (32-bit Linux system calls, NASM syntax) is something like:
    mov edx,len                 ;message length
    mov ecx,msg                 ;message to write
    mov ebx,1                   ;file descriptor (stdout)
    mov eax,4                   ;system call number (sys_write)
    int 0x80                    ;call kernel
    mov eax,1                   ;system call number (sys_exit)
    int 0x80                    ;call kernel
msg db  'Hello, world!',0xa     ;our dear string
len equ $ - msg                 ;length of our dear string
(This code was taken from here).
Each of these instructions, and obviously the data itself, can be represented as numbers. Now, I can just put those numbers inside a buffer, tell the CPU to go to the buffer's position in memory, and start executing the code. Right?
Not so fast.
As you can see in this SO question, it doesn't work until you map the memory as executable. Once you have, you can cast the buffer's address to a function pointer and "call" that memory, and it will run.
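The buffer-then-execute idea can be sketched in C roughly as follows. This is only an illustration under stated assumptions: it targets x86-64 on a POSIX system (Linux or similar), `run_machine_code` and `RETURN_42` are hypothetical names invented here, and the six bytes are the x86-64 encoding of `mov eax, 42` followed by `ret`. A real JIT does far more, but the flow of allocate, copy, make executable, cast, call is the same.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: copies raw x86-64 machine code into fresh memory,
   makes that memory executable, calls it, and returns its result.
   Returns -1 if the OS refuses the mapping or the permission change. */
int run_machine_code(const uint8_t *code, size_t size)
{
    size_t len = 4096;           /* one page is plenty for this sketch */
    assert(size <= len);

    /* 1. Ask the OS for writable (not yet executable) memory. */
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* 2. Put "those numbers" into the buffer. */
    memcpy(buf, code, size);

    /* 3. Flip the page from writable to executable. */
    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return -1;
    }

    /* 4. Cast the buffer to a function pointer and call it. */
    int (*fn)(void) = (int (*)(void))buf;
    int result = fn();

    munmap(buf, len);
    return result;
}

/* mov eax, 42 ; ret   (x86-64 only; an assumption of this sketch) */
const uint8_t RETURN_42[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };
```

Calling `run_machine_code(RETURN_42, sizeof RETURN_42)` on such a machine should return 42. Skipping step 3 and calling into memory that is only readable and writable typically ends in a segmentation fault, which is the failure described above.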
To summarize, as far as I understand, this is more or less how the JITTER works:
Upvotes: 0
Reputation: 66981
You mentioned that you played around with assembly so you have some idea how that works, good. Imagine that you write code that allocates a buffer (ex: at address 0x75612d39). Then your code saves the assembly ops to that buffer to pop a number from the stack, the assembly to call a print function to print that number, then the assembly to "return". Then you push the number 3 onto the stack, and call/jump to address 0x75612d39. The processor will obey the instructions to print your numbers, then return to your code again, and continue. At the assembly level it's actually pretty straightforward.
I don't know any "real" assembly languages, but here's a "sample" cobbled together from a bytecode I know. This machine has 2 byte pointers, the string %s is located at address 6a, and the function printf is located at address 1388.
void myfunc(int a) {
    printf("%s", a);
}
The assembly for this function would look like this:
OP  Params  OpName      Description
13  82 6a   PushString  82 means string, 6a is the address of "%s".
                        So this instruction pushes a pointer to "%s" on the stack.
13  83 00   PushInt     83 means integer, 00 means the one on the top of the stack.
                        So this instruction gets the integer at the top of the stack
                        and pushes it on the stack again.
17  13 88   Call        1388 is printf, so this calls the printf function.
03  02      Pop         This pops the two things we pushed back off the stack.
02          Return      This returns to the calling code.
So when your JITTER reads in void myfunc(int a) {printf("%s", a);}, it allocates memory for this function (ex: at address 0x75612d39) and stores these bytes in that memory: 13 82 6a 13 83 00 17 13 88 03 02 02. Then, to call that function, it simply jumps to (or calls) address 0x75612d39.
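The bytecode above is for a made-up machine, but the same flow can be sketched on real hardware. The C sketch below assumes x86-64 with the System V calling convention on a POSIX system, where the first integer argument travels in a register rather than on the stack as in the toy machine. The names `emit_call_to`, `print_number`, `last_printed`, and `jit_fn` are hypothetical. The generated code is just `movabs rax, <address>` followed by `jmp rax`, so it forwards its argument to an ordinary compiled print function, whose `ret` then returns straight to the original caller.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int last_printed = -1;           /* records what the host function saw */

/* Ordinary compiled function that the generated code jumps to. */
void print_number(int n)
{
    printf("%d\n", n);
    last_printed = n;
}

typedef void (*jit_fn)(int);

/* Hypothetical helper: emits x86-64 code that tail-jumps to `target`,
   leaving the argument register untouched. Returns NULL on failure. */
jit_fn emit_call_to(void (*target)(int))
{
    /* 48 B8 imm64 = movabs rax, imm64 ; FF E0 = jmp rax */
    uint8_t code[] = { 0x48, 0xB8, 0, 0, 0, 0, 0, 0, 0, 0, 0xFF, 0xE0 };
    uint64_t addr = (uint64_t)(uintptr_t)target;
    assert(sizeof code == 12);
    memcpy(&code[2], &addr, sizeof addr);   /* patch in the real address */

    size_t len = 4096;
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    memcpy(buf, code, sizeof code);         /* store the bytes in the buffer */

    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return NULL;
    }
    return (jit_fn)buf;   /* the page is never unmapped in this sketch */
}
```

With this in place, `jit_fn fn = emit_call_to(print_number); fn(3);` jumps into the freshly written buffer, which hands the 3 to `print_number` and then returns to the caller, mirroring the "call address 0x75612d39" step described above.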
Upvotes: 9
Reputation: 9724
When code is executed, it all boils down to the code being loaded into a known part of memory, and the program counter being set to the start of the code, either by a direct register setting, or a jmp instruction, or similar. So what the JIT compiler will do is build the machine code in a known part of memory, and then execute from there.
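To make "build the machine code in memory, then execute from there" concrete, here is a minimal C sketch of a one-instruction "compiler". It assumes x86-64 with the System V calling convention on a POSIX system, and `compile_adder` and `adder_fn` are hypothetical names. The point it tries to show is that the bytes are produced at runtime from a value the program only learns at runtime: the constant `k` is patched into the instruction `lea eax, [rdi + k]` before the buffer is made executable.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

typedef int (*adder_fn)(int);

/* Hypothetical helper: "compiles" the function x -> x + k for a k known
   only at runtime. Returns NULL if memory cannot be set up. */
adder_fn compile_adder(int32_t k)
{
    /* 8D 87 disp32 = lea eax, [rdi + disp32] ; C3 = ret */
    uint8_t code[] = { 0x8D, 0x87, 0, 0, 0, 0, 0xC3 };
    assert(sizeof code == 7);
    memcpy(&code[2], &k, sizeof k);         /* patch the constant in */

    size_t len = 4096;
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    memcpy(buf, code, sizeof code);         /* build the code in memory */

    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return NULL;
    }
    return (adder_fn)buf;   /* the page is never unmapped in this sketch */
}
```

For example, `adder_fn add5 = compile_adder(5);` yields a function for which `add5(10)` evaluates to 15. Calling through the pointer is what sets the program counter to the start of the generated code.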
Upvotes: 2