Implementing bound function parameters in a compiler

Question

I have an idea for a functional programming language design that makes heavy use of bound function parameters. I'm trying to express bound function parameters in x86 assembly as part of the compiler implementation.

var add  = function(x,y) { return x + y; };
var add2 = add.bind({}, 2);
console.log( add2(3) );      // prints 5

For interoperability reasons I'd like to produce bare function pointers, so my first idea is to allocate some executable memory on the heap and copy in a stub that (a) pushes an extra parameter and (b) calls the target function. This would be part of the standard library and would return a native function pointer that I can use from the rest of the x86 assembly program.

I think I've come across a problem with this approach: if the stub uses call to get to the target function, then the stack contains a return address which ends up being interpreted as a function argument! And if the stub uses jmp to get to the target function, then neither the caller nor the callee knows exactly how to clean up the stack when the function returns.
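To make the call-based failure concrete, here is a rough sketch that models the stack as a JavaScript array (last element is the top of the stack). The name callAddViaNaiveStub and the "RET:..." placeholder strings are made up for illustration, and this is only a toy model of the layout, not real x86.

```javascript
// Toy model: the array end is the top of the stack; strings stand in for return addresses.
function callAddViaNaiveStub() {
  const stack = [];
  stack.push(3);            // main: push 0x03 (the only argument main knows about)
  stack.push("RET:main");   // main: call add2 (pushes the return address into main)
  stack.push(2);            // stub: push 0x02 (the bound argument)
  stack.push("RET:stub");   // stub: call add (pushes the return address into the stub)

  // add, cdecl-style: skip its own return address, then read the two arguments.
  const top = stack.length - 1;
  const x = stack[top - 1]; // [esp+4]
  const y = stack[top - 2]; // [esp+8]
  return { x, y };
}

// x is the bound 2, but y is main's return address rather than 3.
console.log(callAddViaNaiveStub());
```

In this model the second argument slot holds main's return address, which is exactly the misinterpretation described above.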

How can this be resolved? I suppose a register could be permanently reserved as a flag for this behaviour but that's hardly elegant.

How is bind() implemented in native compilers for functional languages?

mappu · Accepted Answer

On some further thought, I think this can be done by using a callee-cleanup convention and manually managing all my return addresses. It's similar to stdcall, but not identical, since call/ret can't be used(?).

Pseudocode:

main:
   ; create stub, copy in 0x02 and &add, the resulting function pointer goes in add2
   local add2 = _create_trampoline

   ; make the call
   push [return address]
   push 0x03 ;arg1
   jmp  add2

; the resulting stub, held on the heap somewhere
add2:
   push 0x02 ;bound argument
   jmp  add

; var add(x,y)
add:
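   local x = pop      ; 0x02, the bound argument pushed by the stub
   local y = pop      ; 0x03, pushed by main

   eax = x + y

   jmp pop            ; pop the return address and jump to it (the same effect as a bare ret)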

This way, add knows the stack is laid out as x y [ptr] (top of stack first), and execution returns correctly.
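The same sequence can be traced in a toy stack model to check that everything balances. This is only a sketch: runCalleePops and the "RET:main" placeholder are hypothetical names, and the array stands in for the real stack.

```javascript
// Toy model of the callee-cleanup scheme: the array end is the top of the stack.
function runCalleePops() {
  const stack = [];
  stack.push("RET:main");        // main: push [return address]
  stack.push(3);                 // main: push 0x03
  // main: jmp add2
  stack.push(2);                 // add2: push 0x02 (bound argument)
  // add2: jmp add
  const x = stack.pop();         // add: local x = pop  (the bound 2)
  const y = stack.pop();         // add: local y = pop  (the 3 from main)
  const eax = x + y;             // add: eax = x + y
  const returnTo = stack.pop();  // add: jmp pop
  return { eax, returnTo, wordsLeftOnStack: stack.length };
}

console.log(runCalleePops());
```

In the model add consumes both arguments and the return address itself, so nothing is left behind for main to clean up.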

It seems a little drastic to lose call/ret over this, and the stack frame for function add is pretty tenuous, so I'll leave the question open for at least another 24 hours in the hope of a better solution.

EDIT: On some further thought, you can keep cdecl, caller-cleanup, call/ret and everything by simply carrying along the return address in the bound trampoline (which only requires clobbering one register, or moving it to the stack and back).

Pseudocode:

main:
   ; create stub, copy in 0x02 and &add, the resulting function pointer goes in add2
   local add2 = _magic(0x02, &add);

   ; make the call
   push 0x03;
   call add2;

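; the stub on the heap: slip the bound argument in underneath the return address
add2:
   pop  ecx       ; the return address goes in a temporary (ecx is caller-saved under cdecl; ebx is not)
   push 0x02      ; bound argument
   push ecx       ; return address back on top
   jmp  add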

; var add(x,y)
add:
   push ebp;
   mov ebp, esp;

   ; arguments (not locals): x is [ebp+8], y is [ebp+12]
   mov eax, [ebp+8]
   add eax, [ebp+12]     ; result in eax

   leave
   ret
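The stack movements in the pseudocode above can be traced with a toy model, where an array stands in for the stack (last element is the top). This is a sketch under assumptions: runCdeclTrampoline and the placeholder strings are invented for illustration, and the final pop models the add esp, 4 that a cdecl caller would normally issue after pushing one argument, which the pseudocode does not show.

```javascript
// Toy model of the cdecl trampoline: the array end is the top of the stack.
function runCdeclTrampoline() {
  const stack = [];
  stack.push(3);                 // main: push 0x03
  stack.push("RET:main");        // main: call add2
  const saved = stack.pop();     // add2: pop the return address into a temporary
  stack.push(2);                 // add2: push 0x02 (bound argument)
  stack.push(saved);             // add2: push the return address back
  // add2: jmp add
  stack.push("saved ebp");       // add: push ebp
  const ebp = stack.length - 1;  // add: mov ebp, esp
  const x = stack[ebp - 2];      // [ebp+8]
  const y = stack[ebp - 3];      // [ebp+12]
  const eax = x + y;
  stack.length = ebp + 1;        // add: leave, part 1 (mov esp, ebp)
  stack.pop();                   // add: leave, part 2 (pop ebp)
  const returnTo = stack.pop();  // add: ret
  stack.pop();                   // main: add esp, 4 (assumed caller cleanup of one argument)
  return { eax, returnTo, wordsLeftOnStack: stack.length };
}

console.log(runCdeclTrampoline());
```

The wordsLeftOnStack field reports how many words remain once main has removed the single argument it pushed, which is a useful thing to check before relying on this scheme.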

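The result is a fairly concise technique for implementing bound function parameters as executable objects on the heap, while keeping call/ret and the cdecl frame layout inside add. There will undoubtedly be problems when implementing it. One is already visible: with caller-cleanup, main only removes the one argument it pushed, so the bound argument is still on the stack after the return and has to be dealt with somehow (for example by having the caller restore esp from its frame pointer). Still, I expect the approach is workable and not too horribly inefficient.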
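For a feel of what such a heap stub might contain, here is a sketch that builds the bytes for one possible 32-bit encoding: pop ecx, push imm32, push ecx, jmp rel32. The function encodeBindStub and its parameters are hypothetical, the choice of ecx as the scratch register is an assumption, and actually allocating executable memory and copying the bytes in is left out.

```javascript
// Hypothetical helper: returns 12 bytes encoding
//   pop ecx ; push imm32 ; push ecx ; jmp rel32
// stubAddr is where the bytes will live, because jmp rel32 is position-dependent.
function encodeBindStub(boundValue, targetAddr, stubAddr) {
  const bytes = new Uint8Array(12);
  const view = new DataView(bytes.buffer);
  bytes[0] = 0x59;                     // pop ecx
  bytes[1] = 0x68;                     // push imm32
  view.setInt32(2, boundValue, true);  // little-endian immediate
  bytes[6] = 0x51;                     // push ecx
  bytes[7] = 0xE9;                     // jmp rel32
  // rel32 is measured from the end of the jmp instruction, i.e. stubAddr + 12.
  view.setInt32(8, (targetAddr - (stubAddr + 12)) | 0, true);
  return bytes;
}

// Example with made-up addresses: bind 2 to a function at 0x1000, stub placed at 0x2000.
const stubBytes = encodeBindStub(2, 0x1000, 0x2000);
console.log(Array.from(stubBytes, (b) => b.toString(16).padStart(2, "0")).join(" "));
```

Because the jump displacement depends on where the stub ends up, the bytes have to be generated (or patched) after the executable memory has been allocated.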
