Reputation: 1607
According to this article http://slurp.doc.ic.ac.uk/pubs/observing/linking.html#assignment:
Due to the differences in information between Java code and bytecode (bytecode does not contain the types of local variables), the verifier does not need to check subtypes for assignments to local variables, or to parameters.
My question: Why does the bytecode not contain type information for local variables, whilst it does indeed contain type information for the parameters and return value?
Upvotes: 11
Views: 2143
Reputation: 39451
First off, there are several different notions of type. There are the compile time types, which include generics. However, generics don't exist after compile time.
There is the verification inferred static type of a variable, which can be int, float, long, double, returnaddress, or an object reference. Object references are additionally typed with an upper bound, so that, for instance, all references a given variable can hold are subtypes of java/lang/String. Fields can additionally have one of the short types: byte, short, char, or boolean. These are treated identically to ints for execution purposes but have different storage.
Finally, there is the runtime type, which is the same as the verified static type, except that for object references it is the actual type of the instance being referenced. Note that due to verifier laziness, there are some cases where the runtime type may not actually be a subtype of the verified type. For instance, a variable of declared type Comparable can actually hold any object in Hotspot because the VM doesn't check interfaces at verification time.
Compile time information is not preserved except through optional attributes for reflection and debugging. This is because there's no reason to keep it.
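To illustrate the point about generics and optional attributes, here is a minimal sketch. The class and field names (ErasureDemo, names, counts) are made up for the example. It assumes the class was compiled with javac, which records the generic signature of a field in an optional attribute that reflection can read, while the runtime class of the objects themselves carries no type argument.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Hypothetical fields, used only to have something to reflect on.
    static List<String> names = new ArrayList<>();
    static List<Integer> counts = new ArrayList<>();

    // Both lists share one runtime class: the type argument is gone at run time.
    static boolean sameRuntimeClass() {
        return names.getClass() == counts.getClass();
    }

    // The erased field type, as the VM itself sees it.
    static String erasedTypeOf(String fieldName) {
        return field(fieldName).getType().getName();
    }

    // The generic type, recovered by reflection from the optional signature attribute.
    static String genericTypeOf(String fieldName) {
        return field(fieldName).getGenericType().getTypeName();
    }

    private static Field field(String fieldName) {
        try {
            return ErasureDemo.class.getDeclaredField(fieldName);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());      // true
        System.out.println(erasedTypeOf("names"));   // java.util.List
        System.out.println(genericTypeOf("names"));  // java.util.List<java.lang.String>
    }
}
```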
Local variables have no explicit type information (except for the new StackMapTable attribute, but that's a technicality). Instead, when the class is loaded, the bytecode verifier infers a type for each value by running a static dataflow analysis. The purpose of this is not to catch bugs like compile time type checking might, because it is assumed that the bytecode already went through such checking at compile time.
Instead, the purpose of verification is to ensure that the instructions are not dangerous to the VM itself. For example, it needs to make sure that you aren't taking an integer and interpreting it as an object reference, because that could lead to arbitrary memory access and hacking the VM.
So while bytecode values don't have explicit type information, they do have an implicit type which is the result of static type inference. The details of this vary based on the internal implementation details of each VM, though they are supposed to follow the JVM standard. But you'll only have to worry about that in handwritten bytecode.
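The inference described above can be sketched with a toy verifier. This is not how any real VM implements it: the instruction set (Op), the three-valued type (VType) and the class name ToyVerifier are all invented for the example, and it only handles straight-line code, whereas a real verifier also has to merge types where branches join. It does show the core idea, though: no local has a declared type, yet every slot ends up with an inferred one, and using an int as a reference is rejected.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class ToyVerifier {
    // TOP means "unset or unusable"; a real verifier tracks many more types.
    enum VType { TOP, INT, REF }

    // Hypothetical mini instruction set, loosely modelled on real opcodes.
    enum Op { ICONST, ACONST_NULL, ISTORE, ASTORE, ILOAD, ALOAD, IADD }

    static final class Insn {
        final Op op;
        final int slot; // local variable index, or -1 if the opcode takes none
        Insn(Op op, int slot) { this.op = op; this.slot = slot; }
    }

    static Insn insn(Op op) { return new Insn(op, -1); }
    static Insn insn(Op op, int slot) { return new Insn(op, slot); }

    // Simulates the code on types instead of values and returns the inferred
    // type of each local slot; throws if an instruction sees the wrong type.
    static VType[] inferLocals(Insn[] code, int maxLocals) {
        VType[] locals = new VType[maxLocals];
        Arrays.fill(locals, VType.TOP);
        Deque<VType> stack = new ArrayDeque<>();
        for (Insn in : code) {
            switch (in.op) {
                case ICONST:      stack.push(VType.INT); break;
                case ACONST_NULL: stack.push(VType.REF); break;
                case ISTORE:      locals[in.slot] = pop(stack, VType.INT); break;
                case ASTORE:      locals[in.slot] = pop(stack, VType.REF); break;
                case ILOAD:       stack.push(expect(locals[in.slot], VType.INT)); break;
                case ALOAD:       stack.push(expect(locals[in.slot], VType.REF)); break;
                case IADD:
                    pop(stack, VType.INT);
                    pop(stack, VType.INT);
                    stack.push(VType.INT);
                    break;
            }
        }
        return locals;
    }

    static VType pop(Deque<VType> stack, VType wanted) {
        if (stack.isEmpty()) throw new IllegalStateException("stack underflow");
        return expect(stack.pop(), wanted);
    }

    static VType expect(VType actual, VType wanted) {
        if (actual != wanted) {
            throw new IllegalStateException("expected " + wanted + " but found " + actual);
        }
        return actual;
    }

    public static void main(String[] args) {
        Insn[] ok = {
            insn(Op.ICONST), insn(Op.ISTORE, 0),
            insn(Op.ACONST_NULL), insn(Op.ASTORE, 1),
            insn(Op.ILOAD, 0), insn(Op.ILOAD, 0), insn(Op.IADD), insn(Op.ISTORE, 0)
        };
        System.out.println(Arrays.toString(inferLocals(ok, 2))); // [INT, REF]

        // Stores an int in slot 0, then tries to load it as a reference.
        Insn[] bad = { insn(Op.ICONST), insn(Op.ISTORE, 0), insn(Op.ALOAD, 0) };
        try {
            inferLocals(bad, 1);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```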
Fields have an explicit type since the VM needs to know which type of data is being stored in it. Method parameters and return types are encoded in what is known as a method descriptor, also used in type checking. They're impossible to infer automatically because these values can come from or go anywhere, while type checking is done on a per class basis.
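If you want to see what a method descriptor looks like, the standard java.lang.invoke.MethodType class can print one. The sketch below is only an illustration; the class name DescriptorDemo and the helper descriptorOf are made up, and the two signatures are arbitrary examples.

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    // Hypothetical helper: returns the JVM descriptor string for a method
    // with the given return type and parameter types.
    static String descriptorOf(Class<?> returnType, Class<?>... paramTypes) {
        return MethodType.methodType(returnType, paramTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // String f(int, Object)
        System.out.println(descriptorOf(String.class, int.class, Object.class));
        // prints (ILjava/lang/Object;)Ljava/lang/String;

        // void g(long[], double)
        System.out.println(descriptorOf(void.class, long[].class, double.class));
        // prints ([JD)V
    }
}
```

Note that the descriptor carries only the erased parameter and return types, not parameter names or generic arguments.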
P.S. I left out a few minor details when talking about the verification types. Object types additionally track whether they have been initialized or not, and which instruction created them if uninitialized. Address types track the target of the jsr that created them.
Upvotes: 6
Reputation: 47183
That's a pretty old paper. Current class files do include types for local and stack variables. The types aren't stored in the method bytecode, but are stored in a StackMapTable attribute attached to the method.
It is (and always was) possible to reconstruct the types of all local variables and stack elements by dataflow analysis without a StackMapTable, but it is computationally expensive. Code with StackMapTables can be verified much faster. I have to confess that I don't see how verifying the StackMapTables can be faster than doing the analysis, but then I know almost nothing about this.
Upvotes: 4
Reputation: 44740
Java bytecode retains type information about fields, method returns and parameters, but it does not, as you asked, contain type information for local variables.
The type information in the Java class file makes decompiling bytecode easier than decompiling machine code. Decompiling Java bytecode still requires analysis of most local variable types, flattening of stack-based instructions, and structuring of loops and conditionals.
The task of bytecode decompilation, however, is much harder than compilation. You will often see that decompilers cannot fully perform their intended function.
Upvotes: 2