Reputation: 33405
When designing a byte code interpreter, is there a consensus these days on whether stack or three address format (or something else?) is better? I'm looking at these considerations:
The objective language is a dynamic language fairly similar to Javascript.
Performance is important, but development speed and portability are more so for the moment.
Therefore the implementation will be strictly an interpreter for the time being; a JIT compiler may come later, resources permitting.
The interpreter will be written in C.
Upvotes: 9
Views: 1927
Reputation: 202535
Experiments done by David Gregg and Roberto Ierusalimschy have shown that a register-based bytecode works better than a stack-based bytecode because fewer bytecode instructions (and therefore less decoding overhead) are required to do the same tasks. So three-address format is a clear winner.
Upvotes: 6
Reputation: 9715
Take a look at the OCaml bytecode interpreter - it's one of the fastest of its kind. It is pretty much a stack machine, translated into a threaded code on loading (using the GNU computed goto extension). You can generate a Forth-like threaded code as well, should be relatively easy to do.
But if you're keeping a future JIT compilation in mind, make sure that your stack machine is not really a full-featured stack machine, but an expression tree serialisation form instead (like .NET CLI) - this way you'd be able to translate your "stack" bytecode into a 3-address form and then into an SSA.
Upvotes: 0
Reputation: 19349
I don't have much (not really any) experience in this area, so you might want to verify some of the following for yourself (or maybe someone else can correct me where necessary?).
The two languages I work with most nowadays are C# and Java, so I am naturally inclined to their methodologies. As most people know, both are compiled to byte code, and both platforms (the CLR and the JVM) utilize JIT (at least in the mainstream implementations). Also, I would guess that the jitters for each platform are written in C/C++, but I really don't know for sure.
All-in-all, these languages and their respective platforms are pretty similar to your situation (aside from the dynamic part, but I'm not sure if this matters). Also, since they are such mainstream languages, I'm sure their implementations can serve as a pretty good guide for your design.
With that out of the way, I know for sure that both the CLR and the JVM are stack-based architectures. Some of the advantages which I remember for stack-based vs register-based are
Also, I find stack-based to be a little more intuitive and readable, but that's a subjective thing, and like I said before, I haven't seen too much byte code yet.
Some advantages of the register-based architecture are
Of course, there are always ways to offset the disadvantages for each, but I think these describe the obvious things to consider.
Upvotes: 1
Reputation: 72312
Read The evolution of Lua and The implementation of Lua 5.0 for how Lua changed from a stack-based virtual machine to a register-based virtual machine and why it gained performance doing it.
Upvotes: 8
Reputation: 27460
If you have JIT in your mind then bytecodes is the only option.
Just in case you can take a look on my TIScript: http://www.codeproject.com/KB/recipes/TIScript.aspx and sources: http://code.google.com/p/tiscript/
Upvotes: -1