Reputation: 475
As noted in https://docs.oracle.com/javase/specs/jvms/se14/html/jvms-2.html#jvms-2.11.1, encoding operand types into opcodes comes at a cost:
Given the Java Virtual Machine's one-byte opcode size, encoding types into opcodes places pressure on the design of its instruction set. If each typed instruction supported all of the Java Virtual Machine's run-time data types, there would be more instructions than could be represented in a byte.
Thus, it seems one should only do this for instructions where type information of the operands is required or enables optimizations. For example, differentiation between `iadd` and `fadd` is required because addition for integers and floats is implemented differently. And I don't know exactly why there are different instructions for loading a `boolean` and an `int` from an array (`baload` and `iaload`, respectively), but I can at least imagine some performance reasons.
However, why are there different instructions for storing an `int` (`istore`) and a `float` (`fstore`) into a local variable? Shouldn't they be implemented in absolutely the same way?
This answer https://stackoverflow.com/a/2638143 says typed instructions are required for the bytecode verifier. But is this really necessary? In a method, all data flows from the method's parameters (for which the types are known) and from class fields (for which the types are also known) to other class fields and to the return value. Thus, since the types for inputs and outputs are known, can't we reconstruct any missing types for the instructions? In fact, isn't this what the bytecode verifier does anyway, since it has to check the types, i.e., it must know which types are expected?
In short: What would break if we were to combine `istore` and `fstore` into a single instruction? Would performance or portability suffer? Would bytecode verification stop working?
Upvotes: 2
Views: 308
Reputation: 98284
`istore` and `fstore` are implemented differently on pretty much every JVM and every architecture I have worked with.
For example, in the HotSpot JVM x64 interpreter, `istore_0` is implemented as
mov dword ptr [r14], eax
whereas `fstore_0` is implemented as
movss dword ptr [r14], xmm0
The interpreter caches the top-of-stack value in a register, and there are different registers for integer and floating-point values.
Similarly, `baload` and `iaload` are implemented differently, as they use different offset multipliers (1 and 4 respectively) and require different machine instructions to load an 8-bit value vs. a 32-bit value.
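To make the multiplier point concrete, here is a minimal sketch in Java. It is not HotSpot code: the class `ArrayLoadSketch`, the flat `ByteBuffer` standing in for the heap, and the `base` parameter are all assumptions made for illustration, and the array header a real JVM would skip over is left out.

```java
import java.nio.ByteBuffer;

// Illustrative only: models element addressing, not a real JVM heap layout.
final class ArrayLoadSketch {
    // baload: element offset is index * 1, and one byte is read.
    // Widening the byte to int sign-extends it, as baload does for byte arrays.
    static int baload(ByteBuffer heap, int base, int index) {
        return heap.get(base + index * 1);
    }

    // iaload: element offset is index * 4, and four bytes are read.
    static int iaload(ByteBuffer heap, int base, int index) {
        return heap.getInt(base + index * 4);
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(16);
        heap.put(3, (byte) -1);     // byte element at index 3 (base 0)
        heap.putInt(8, 123456);     // int element at index 2 (base 0)
        System.out.println(baload(heap, 0, 3));
        System.out.println(iaload(heap, 0, 2));
    }
}
```

The two methods differ both in the scale factor and in the width of the read, which is why a single untyped array load could not be compiled to one fixed machine instruction.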
As you noticed, some type information can be derived from data-flow analysis, which the bytecode verifier does anyway. But in order to use this information at run time, the stack slots and local variables would need to be tagged with the corresponding type, and the merged bytecode instructions would need to read this tag at run time and dispatch on it. Of course, this would be suboptimal in both run-time performance and memory use.
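A rough sketch of what that tagging would look like, written in Java rather than interpreter assembly. Everything here is hypothetical: `TaggedFrame`, its `Tag` enum, and the two fields standing in for the integer and floating-point top-of-stack registers are made up for illustration and are not how any real JVM is written.

```java
// Hypothetical model of an interpreter frame with type-tagged slots.
final class TaggedFrame {
    enum Tag { INT, FLOAT }

    private final Tag[] tags;   // one tag per local: the extra memory cost
    private final int[] slots;  // raw 32-bit payload of each local

    // Stand-ins for the two top-of-stack registers (integer vs. floating point).
    private int intTos;
    private float floatTos;
    private Tag tosTag;

    TaggedFrame(int maxLocals) {
        tags = new Tag[maxLocals];
        slots = new int[maxLocals];
    }

    void pushInt(int v)     { intTos = v;   tosTag = Tag.INT; }
    void pushFloat(float v) { floatTos = v; tosTag = Tag.FLOAT; }

    // Merged, untyped store: must read the tag and branch on every execution.
    void store(int index) {
        if (tosTag == null) throw new IllegalStateException("empty stack");
        switch (tosTag) {
            case INT:   slots[index] = intTos; break;
            case FLOAT: slots[index] = Float.floatToRawIntBits(floatTos); break;
        }
        tags[index] = tosTag;
    }

    // Typed stores: the opcode already says which register to read, so no tag, no branch.
    void istore(int index) { slots[index] = intTos; }
    void fstore(int index) { slots[index] = Float.floatToRawIntBits(floatTos); }

    int rawSlot(int index) { return slots[index]; }
    Tag tagOf(int index)   { return tags[index]; }
}
```

The merged `store` needs the `tags` array and a branch on `tosTag`, while `istore` and `fstore` each boil down to a single unconditional move.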
Upvotes: 7
Reputation: 39451
I think you are correct about loads and stores not needing to be typed. Not being one of the original designers of Java myself, I can only speculate about why things were designed that way. But here's my guess.
I think that when Java was first designed, it was designed with interpretation in mind, and it's possible that they thought that loads and stores would need to be implemented differently for ints and floats. It's also possible they wanted to be able to run without verification (in fact, it is still possible to disable bytecode verification today). Lastly, while it is technically possible to still do the same bytecode verification when load and store instructions are untyped, it would make things slightly more complicated.
A simple proof that verification could be done with untyped loads and stores is that the local variable table behaves similarly to the operand stack, and there are untyped instructions that operate on the operand stack (`dup`, `swap`, etc.).
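To illustrate the idea, here is a toy type checker in Java for straight-line code with hypothetical untyped `store` and `load` instructions. It is only a sketch under heavy assumptions: `UntypedStoreVerifier` and `VType` are invented names, only `int` and `float` are modeled, and branches (where the types from several paths would have to be merged) are ignored entirely, which is the part I'd expect to get more complicated.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy verifier: tracks types only, never values. Straight-line code only.
final class UntypedStoreVerifier {
    enum VType { INT, FLOAT }

    private final Deque<VType> stack = new ArrayDeque<>();
    private final VType[] locals;

    UntypedStoreVerifier(int maxLocals) { locals = new VType[maxLocals]; }

    // Models any instruction that pushes a value of a known type (iconst, fconst, ...).
    void push(VType t) { stack.push(t); }

    // Hypothetical untyped store: the local takes whatever type is on top of the stack.
    void store(int index) { locals[index] = pop(); }

    // Hypothetical untyped load: pushes the type the local currently holds.
    void load(int index) {
        VType t = locals[index];
        if (t == null) throw new IllegalStateException("load from unset local " + index);
        stack.push(t);
    }

    // Typed arithmetic still checks its operands, so type errors are caught here.
    void iadd() {
        expect(VType.INT);
        expect(VType.INT);
        stack.push(VType.INT);
    }

    VType localType(int index) { return locals[index]; }

    private VType pop() {
        if (stack.isEmpty()) throw new IllegalStateException("stack underflow");
        return stack.pop();
    }

    private void expect(VType want) {
        VType got = pop();
        if (got != want) throw new IllegalStateException("expected " + want + ", got " + got);
    }
}
```

For example, pushing a `FLOAT`, calling `store(0)`, then `load(0)`, pushing an `INT` and calling `iadd()` is rejected, because the type of local 0 was propagated from the stack even though the store itself carried no type.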
Upvotes: 2