Reputation: 33
I've recently been developing a program that can analyze a Java class file. After running the program, this was its output:
class test_1 {
    public static String a = "Hello World";
    public static void main(String[] args) {
        int j = 0;
        for (int i = 0; i < 10; i++) {
            System.out.println(a);
            j = j + j*j + j/(j+1);
        }
    }
}
I got a bytecode 0xe2, which is not specified in the JVM specification for Java SE 14. What does 0xe2 do?
Upvotes: 0
Views: 135
Reputation: 159086
Your program prints every byte as if it were a bytecode instruction, ignoring the fact that many instructions take operands and are therefore multi-byte instructions.
For example, your program incorrectly prints the constructor as follows:
2a: aload_0
b7: invokespecial
00: nop
01: aconst_null
b1: return
If you run javap -c test_1.class, you will instead see:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
The number before the colon is the offset, not the bytecode. As you can see, offsets 2 and 3 are missing, because the invokespecial instruction uses 2 bytes for its operands, which is documented:
Format
invokespecial indexbyte1 indexbyte2
Description
The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (§2.6), where the value of the index is (indexbyte1 << 8) | indexbyte2.
With the 2 bytes being 00 and 01, the index is 1, so the bytecode instruction is as javap showed: invokespecial #1
If you then look at the constant pool output, you'll see that constant #1 is a methodref to the Object no-arg constructor.
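The index calculation above can be sketched in Java as follows. This is only an illustration: the class name ConstantPoolIndex and the method unsignedIndex are made up for this example and are not part of any JVM API. The one detail worth noting is that Java's byte type is signed, so each byte has to be masked with 0xFF before shifting.

```java
public class ConstantPoolIndex {
    // Hypothetical helper: combines the two operand bytes that follow an
    // opcode into an unsigned 16-bit constant pool index.
    static int unsignedIndex(byte indexbyte1, byte indexbyte2) {
        // Java bytes are signed, so mask each one to 0..255 before combining.
        return ((indexbyte1 & 0xFF) << 8) | (indexbyte2 & 0xFF);
    }

    public static void main(String[] args) {
        // The constructor bytes from the answer: aload_0, invokespecial 00 01, return.
        byte[] code = {(byte) 0x2a, (byte) 0xb7, 0x00, 0x01, (byte) 0xb1};
        int index = unsignedIndex(code[2], code[3]);
        System.out.println("invokespecial #" + index); // prints "invokespecial #1"
    }
}
```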
Your specific question is about the bytes a7 ff e2, which are not 3 instructions but a single 3-byte goto instruction:
Format
goto branchbyte1 branchbyte2
Description
The unsigned bytes branchbyte1 and branchbyte2 are used to construct a signed 16-bit branchoffset, where branchoffset is (branchbyte1 << 8) | branchbyte2.
This means that ff e2 gives branchoffset = 0xffe2, which is -30 when read as a signed 16-bit value, so instead of
a7: goto
ff: impdep2
e2: (null)
Your program should have printed something like:
a7 ff e2: goto -30
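As a rough sketch of how the signed offset can be decoded in Java (the class BranchOffset and the method signedOffset are hypothetical names chosen for this example), casting the combined 16-bit value to short performs the sign extension:

```java
public class BranchOffset {
    // Hypothetical helper: builds the signed 16-bit branch offset from the
    // two operand bytes that follow a goto opcode.
    static int signedOffset(byte branchbyte1, byte branchbyte2) {
        int unsigned = ((branchbyte1 & 0xFF) << 8) | (branchbyte2 & 0xFF);
        // Narrowing to short reinterprets 0x8000..0xFFFF as negative values.
        return (short) unsigned;
    }

    public static void main(String[] args) {
        byte[] insn = {(byte) 0xa7, (byte) 0xff, (byte) 0xe2};
        int offset = signedOffset(insn[1], insn[2]);
        System.out.println(String.format("%02x %02x %02x: goto %d",
                insn[0] & 0xFF, insn[1] & 0xFF, insn[2] & 0xFF, offset));
        // prints "a7 ff e2: goto -30"
    }
}
```

Note that javap prints the absolute target of a branch (the offset of the goto opcode plus branchoffset), not the relative value, so its output for the same bytes would show a different number.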
Upvotes: 1
Reputation: 35512
You fail to account for multi-byte instructions. From the reference for goto, the format is:
goto
branchbyte1
branchbyte2
The unsigned bytes branchbyte1 and branchbyte2 are used to construct a signed 16-bit branchoffset, where branchoffset is (branchbyte1 << 8) | branchbyte2. Execution proceeds at that offset from the address of the opcode of this goto instruction. The target address must be that of an opcode of an instruction within the method that contains this goto instruction.
goto is 0xa7, and it should be followed by 2 bytes that denote the branch location, making the instruction 3 bytes wide. Your code ignores this, disassembling 1 byte, then treating the next 2 bytes as valid instructions, which they aren't.
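One way to fix this is to advance the program counter by the full width of each instruction rather than by one byte. The following is a minimal sketch under heavy assumptions: MiniDisassembler is a made-up class that knows only four opcodes (aload_0, goto, return, invokespecial), whereas a real disassembler needs the operand layout of every opcode in the specification, including variable-length ones such as tableswitch, lookupswitch and wide.

```java
import java.util.ArrayList;
import java.util.List;

public class MiniDisassembler {
    // Hypothetical sketch: decodes a code array, consuming operand bytes
    // together with their opcode. Only four opcodes are handled.
    static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;
            switch (opcode) {
                case 0x2a: // aload_0: no operands
                    lines.add(pc + ": aload_0");
                    pc += 1;
                    break;
                case 0xb1: // return: no operands
                    lines.add(pc + ": return");
                    pc += 1;
                    break;
                case 0xb7: { // invokespecial: unsigned 2-byte constant pool index
                    int index = ((code[pc + 1] & 0xFF) << 8) | (code[pc + 2] & 0xFF);
                    lines.add(pc + ": invokespecial #" + index);
                    pc += 3;
                    break;
                }
                case 0xa7: { // goto: signed 2-byte offset relative to this opcode
                    int offset = (short) (((code[pc + 1] & 0xFF) << 8) | (code[pc + 2] & 0xFF));
                    lines.add(pc + ": goto " + (pc + offset));
                    pc += 3;
                    break;
                }
                default:
                    throw new IllegalArgumentException(
                            String.format("opcode 0x%02x is not handled by this sketch", opcode));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Bytes of the default constructor: 2a b7 00 01 b1
        byte[] ctor = {(byte) 0x2a, (byte) 0xb7, 0x00, 0x01, (byte) 0xb1};
        disassemble(ctor).forEach(System.out::println);
    }
}
```

For the constructor bytes this prints offsets 0, 1 and 4, skipping 2 and 3 because they are consumed as the operands of invokespecial.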
Upvotes: 2