Reputation: 1607

Java bytecode: types of local variables?

According to this article http://slurp.doc.ic.ac.uk/pubs/observing/linking.html#assignment:

Due to the differences in information between Java code and bytecode (bytecode does not contain the types of local variables), the verifier does not need to check subtypes for assignments to local variables, or to parameters.

My question: Why does the bytecode not contain type information for local variables, whilst it does indeed contain type information for the parameters and return value?

Upvotes: 11

Answers (3)

Antimony

Reputation: 39451

First off, there are several different notions of type. There are the compile time types, which include generics. However, generics don't exist after compile time.
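As a small illustration of generics not surviving compilation, here is a minimal sketch (the class name `ErasureDemo` and method name are made up for this example). Two lists with different type arguments end up with the same runtime class:

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    // Returns true when two differently parameterized lists share one runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        // The type arguments <String> and <Integer> exist only for the compiler.
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
    }
}
```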

There is the static type that the verifier infers for a variable, which can be int, float, long, double, returnaddress, or an object reference. Object references are additionally typed with an upper bound, so that, for instance, a given reference is known to be a subtype of java/lang/String. Fields can additionally have one of the short types: byte, short, char, or boolean. These are treated identically to ints for execution purposes but have different storage.

Finally, there is the runtime type, which is the same as the verified static type, but in the case of object references, represents the actual type of the instance being referenced. Note that due to verifier laziness, there are some cases where the runtime type may not actually be a subtype of the verified type. For instance, a variable of declared type Comparable can actually hold any object in Hotspot because the VM doesn't check interfaces at verification time.

Compile time information is not preserved except through optional attributes for reflection and debugging. This is because there's no reason to keep it.

Local variables have no explicit type information (except for the new StackMapTable attribute, but that's a technicality). Instead, when the class is loaded, the bytecode verifier infers a type for each value by running a static dataflow analysis. The purpose of this is not to catch bugs like compile time type checking might, because it is assumed that the bytecode already went through such checking at compile time.

Instead, the purpose of verification is to ensure that the instructions are not dangerous to the VM itself. For example, it needs to make sure that you aren't taking an integer and interpreting it as an object reference, because that could lead to arbitrary memory access and hacking the VM.

So while bytecode values don't have explicit type information, they do have an implicit type which is the result of static type inference. The details of this vary based on the internal implementation details of each VM, though they are supposed to follow the JVM standard. But you'll only have to worry about that in handwritten bytecode.
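To make the inference idea concrete, here is a toy sketch. It is not how any real VM implements verification: it handles only straight-line code (no branches, so no merging of states), knows only two types, and uses simplified instruction strings. The class `ToyVerifier` and everything in it are hypothetical names for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class ToyVerifier {
    // Simplified verification types; a real verifier tracks many more.
    enum VType { UNSET, INT, REF }

    // Infers the type held by each local slot after running straight-line code.
    // Throws IllegalStateException when an instruction sees an operand of the wrong type.
    static VType[] inferLocals(int maxLocals, String... code) {
        VType[] locals = new VType[maxLocals];
        Arrays.fill(locals, VType.UNSET);
        Deque<VType> stack = new ArrayDeque<>();
        for (String insn : code) {
            String[] parts = insn.split(" ");
            int slot = parts.length > 1 ? Integer.parseInt(parts[1]) : -1;
            switch (parts[0]) {
                case "iconst_0":    stack.push(VType.INT); break;
                case "aconst_null": stack.push(VType.REF); break;
                case "istore":      locals[slot] = pop(stack, VType.INT, insn); break;
                case "astore":      locals[slot] = pop(stack, VType.REF, insn); break;
                case "iload":
                    require(locals[slot], VType.INT, insn);
                    stack.push(VType.INT);
                    break;
                case "aload":
                    require(locals[slot], VType.REF, insn);
                    stack.push(VType.REF);
                    break;
                case "iadd":
                    pop(stack, VType.INT, insn);
                    pop(stack, VType.INT, insn);
                    stack.push(VType.INT);
                    break;
                default:
                    throw new IllegalArgumentException("unknown instruction: " + insn);
            }
        }
        return locals;
    }

    // Pops one value and checks that it has the expected type.
    private static VType pop(Deque<VType> stack, VType expected, String insn) {
        if (stack.isEmpty()) throw new IllegalStateException("stack underflow at " + insn);
        VType actual = stack.pop();
        require(actual, expected, insn);
        return actual;
    }

    private static void require(VType actual, VType expected, String insn) {
        if (actual != expected) {
            throw new IllegalStateException(insn + ": expected " + expected + " but found " + actual);
        }
    }

    public static void main(String[] args) {
        // Slot 1 first holds an int and is then reused for a reference.
        // This is accepted because slots carry no declared type.
        System.out.println(Arrays.toString(
                inferLocals(2, "iconst_0", "istore 1", "aconst_null", "astore 1")));
        // Storing an int and then loading it as a reference is rejected.
        try {
            inferLocals(2, "iconst_0", "istore 1", "aload 1");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The second call in `main` is the kind of int-as-reference confusion described above; the sketch rejects it purely from the inferred types, without any declared type for slot 1.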

Fields have an explicit type since the VM needs to know which type of data is being stored in it. Method parameters and return types are encoded in what is known as a method descriptor, also used in type checking. They're impossible to infer automatically because these values can come from or go anywhere, while type checking is done on a per class basis.
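For a feel of what a method descriptor looks like, the standard `java.lang.invoke.MethodType` class can print one. This is only a sketch; the wrapper class `DescriptorDemo` and the sample signature are invented for the example.

```java
import java.lang.invoke.MethodType;

public class DescriptorDemo {
    // Builds the JVM descriptor string for a method with the given return and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // A hypothetical method "int find(String s, long from)" has this descriptor:
        System.out.println(descriptor(int.class, String.class, long.class)); // (Ljava/lang/String;J)I
    }
}
```

The parameter types appear inside the parentheses and the return type follows them, which is the explicit information the verifier reads at call sites instead of inferring it.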

P.S. I left out a few minor details when talking about the verification types. Object types additionally track whether they have been initialized or not, and which instruction created them if uninitialized. Address types track the target of the jsr that created them.

Upvotes: 6

Tom Anderson

Reputation: 47183

That's a pretty old paper. Current class files do include types for local and stack variables. The types aren't stored in the method bytecode, but are stored in a StackMapTable attribute attached to the method.

It is (and always was) possible to reconstruct the types of all local variables and stack elements by dataflow analysis without a StackMapTable, but it is computationally expensive. Code with StackMapTables can be verified much faster. I have to confess, though, that I don't see how verifying the StackMapTables can be faster than doing the analysis, but then I know almost nothing about this.

Upvotes: 4

Adil Shaikh

Reputation: 44740

Java bytecode retains type information about fields, method returns and parameters, but it does not, as you noted, contain type information for local variables.

The type information in the Java class file makes decompiling bytecode easier than decompiling machine code. Even so, decompiling Java bytecode requires inferring most local variable types, flattening stack-based instructions, and structuring loops and conditionals. Bytecode decompilation is therefore much harder than compilation, and you will often see decompilers that cannot fully perform their intended function.

Upvotes: 2
