Reputation: 13936
I've been trying to learn what exactly jump tables are, and I'm having trouble understanding something. From the many examples I've seen they seem to pretty much boil down to this, or at least this is one version of it:
void func1() {};
void func2() {};
void func3() {};
int main()
{
void(*jumpTo[3])(void) = { func1, func2, func3 };
jumpTo[1]();
return 0;
}
These just appear to be an array of function pointers, that are indexed by a certain value/position. Is it the case then that a jump table is just indexing an array of function pointers? I'm really curious about this because I've seen a lot of people saying that switch statements are often compiled into jump tables as a performance measures. From my understanding, by jumping to a function in this way it involves a pointer dereference and a function call. I thought that both of these weren't that great for performance.
Another answer on this site said that by doing it this way "you are adding a function call overhead that a switch statement doesn't necessarily have." How would a switch that compiles to a jump table avoid function calls?
Also, a highly voted answer here said "A jump table can be either an array of pointers to functions OR an array of machine code jump instructions." How would you jump to machine code instructions instead of dereferencing a pointer? Is this faster?
Is the difference between the two that in my above example the pointer doesn't have to be dereferenced because it can be statically bound? As opposed to passing in a random number as index at runtime?
Thanks.
Upvotes: 1
Views: 1396
Reputation: 1272
A jump table and your function table are basically the same - an array of addresses. A jump table contains addresses of goto
- targets.
The only difference between both is how the jump is made. When a function is called the return address is pushed on the stack, so when the function terminates it can return.
Here an example of a jump table:
#include <stdio.h>
int main(int argc, char *argv[])
{
switch (argc)
{
case 1:
printf("You provided no arguments.");
break;
case 2:
printf("You provided one argument.");
break;
case 3:
printf("You provided two arguments.");
break;
case 4:
printf("You provided three arguments.");
break;
case 5:
printf("You provided four arguments.");
break;
case 6:
printf("You provided five arguments.");
break;
default:
printf("You provided %d arguments.", argc-1);
break;
}
return 0;
}
This compiles to:
cmp edi, 6 ;Bounds check
ja .L2 ;jump to default branch
mov eax, edi
jmp [QWORD PTR .L4[0+rax*8]]
.L4:
.quad .L2 ;case 0 (same as default!!!)
.quad .L3 ;case 1
.quad .L5 ;case 2
.quad .L6 ;case 3
.quad .L7 ;case 4
.quad .L8 ;case 5
.quad .L9 ;case 6
Upvotes: 3
Reputation: 133929
Generally the term jump table refers to a technique wherein there are more than 2 branches/jump targets, and the jump/branch target is chosen by a variable by calculating the position in the table, one way or the another. In essence, the example you provided:
void(*jumpTo[3])(void) = { func1, func2, func3 };
jumpTo[1]();
as a whole is the use of a jump table - not just the dereference of the function pointer.
C offers other mechanisms too - for example a switch-case is often compiled into a jump table, especially if the case values have a narrow range, and have few gaps in between. Another mechanism provided by GCC as a non-standard extension is the use of goto
labels as pointer values with computed goto
.
Upvotes: 1
Reputation: 33719
The main difference is that for a jump table, you can typically use addressing relative to the program counter, so that the table will not need any relocations and can live in the .text
section (or some other section which is non-writable and shared). This is because a typical jump table is used only at very few places within the same object files, and all the offsets are known to the assembler.
If you have an array of function pointers instead, then you somehow need to produce real pointers, and that needs some form of relocation.
The second possibility, the array of jump instructions, is not really restricted to jump instructions per se. The important part is that all the target instruction sequences (except the last one) are of the same length, so that the offset to jump to can be computed easily. This way, no jump table it is needed, but it does need exact instruction width (and count) information, which is difficult to guarantee on most targets (RISC architectures can have difficult-to-predict effective instruction counts when it comes to loading constants). This means that in practice, this approach is restricted to a very specific form of jump instruction for the targets.
Upvotes: 1