Is intermediate representation (such as bytecodes or .net IL) still an advantage?

Question

Is an intermediate representation (IR), such as Java bytecode or .NET CIL, still an advantage? Can't we just deploy software components as source code?

One of the arguments in favor of IR was the portability of software components: it avoids the need to compile the source code for each target architecture (provided a virtual machine exists for that architecture). IR offers an abstraction over the specifics of each architecture. Likewise, together with metadata, it brings other advantages, such as enabling security guarantees and checking that accesses are safe.

Today, some technologies such as Node.js (with the V8 engine) introduce the idea of deployable components in source code, called packages in Node.js (I am not sure whether Node.js originated this idea). Source code contains the same information as IR plus metadata. Moreover, deploying components as source code does not prevent the runtime engine from using the same principles as a modern virtual machine, such as just-in-time compilation and late-bound data types, which allow adaptive optimization and thus, in theory, can yield faster execution.

So, is there any advantage to deploying software components in IR rather than as source code?

jackmott · Accepted Answer

The distinctions begin to blur in some cases which I will note below. But in general:

One advantage of an IR bytecode is that it obfuscates the logic you have created. However, so does minified JavaScript, and to a similar degree in some cases.
Another advantage is reduced size, but minified JavaScript is also small, perhaps similarly so.
A third advantage would be faster JIT compilation, as the bytecode is closer to actual machine instructions than source code would be. While you can do JIT with source code, it will take more instructions and/or memory to do it, because the source must first be tokenized and parsed. So, all else being equal, you should get better performance with bytecode deployment. It should be noted that all else is rarely equal, so you may not always observe this performance advantage, or it may be relatively small depending on your performance needs.
A fourth advantage would be that it is easier for other languages to target a bytecode IR than to target a source language. While it is possible to create a language that compiles to JavaScript, it is usually easier to compile down to a bytecode, and you will have more control over the performance and correctness of the result, since you are compiling down to something closer to machine code.
Lastly, it is possible, and even done in some cases (by this very website, for instance), to hand-tune your bytecode for performance, just as people sometimes do with assembler.
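
To make the third and fourth points concrete, here is a minimal sketch in Python. The three-instruction stack bytecode and the names `compile_infix`, `compile_postfix`, and `run` are all made up for illustration; they do not correspond to the JVM, the CLR, or V8. The sketch only shows that a source-level front end has to tokenize and parse before anything can run, and that two different surface syntaxes can target the same bytecode.

```python
import re

# Hypothetical stack bytecode: ("PUSH", n), ("ADD",), ("MUL",)


def tokenize(src):
    """Split infix source into number, operator, and parenthesis tokens."""
    return re.findall(r"\d+|[+*()]", src)


def compile_infix(src):
    """Front end 1: infix source -> bytecode ('*' binds tighter than '+')."""
    tokens = tokenize(src)
    pos = 0
    code = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def atom():
        tok = advance()
        if tok == "(":
            expr()
            advance()  # consume the closing ")"
        else:
            code.append(("PUSH", int(tok)))

    def term():
        atom()
        while peek() == "*":
            advance()
            atom()
            code.append(("MUL",))

    def expr():
        term()
        while peek() == "+":
            advance()
            term()
            code.append(("ADD",))

    expr()
    return code


def compile_postfix(src):
    """Front end 2: a different syntax (RPN) targeting the same bytecode."""
    ops = {"+": ("ADD",), "*": ("MUL",)}
    return [ops[t] if t in ops else ("PUSH", int(t)) for t in src.split()]


def run(code):
    """The 'virtual machine': executes bytecode with no parsing step at all."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "ADD" else a * b)
    return stack.pop()
```

Here `compile_infix("1 + 2 * 3")` and `compile_postfix("1 2 3 * +")` yield the same instruction list, and `run` never needs to know which front end produced it. If you ship the bytecode, only `run` has to exist on the target; if you ship the source, the tokenizer and parser have to ship and execute there too.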

Now the distinction can get blurry. One could imagine a fairly heavyweight bytecode format, perhaps due to the need to support a vast range of hardware, or perhaps due to poor design, which might be further removed from machine code than, say, interpreted ANSI C would be. But if we assume that a bytecode is a reasonable attempt to approximate machine instructions, and assume that "source code" represents something as high level as C or higher, then the advantages above should hold.
