Hamim Mahmud

Reputation: 33

Found an unspecified JVM Bytecode (0xe2) in java class file

I'm developing a program that analyzes Java class files. This was its output for the class below: [program output 1] [program output 2] [program output 3]

class test_1 {

    public static String a = "Hello World";

    public static void main(String[] args) {
        int j = 0;
        for(int i = 0;i<10;i++) {
            System.out.println(a);
            j = j + j*j +j/(j+1);
        }
    }
}

I got a bytecode 0xe2, which is not specified in the JVM Specification for Java SE 14. What does 0xe2 do?

Upvotes: 0

Views: 135

Answers (2)

Andreas

Reputation: 159086

Your program is printing every byte as if it were a bytecode instruction, ignoring the fact that many instructions take operands and are therefore multi-byte instructions.

E.g. your program is incorrectly outputting the constructor as follows:

2a: aload_0
b7: invokespecial
00: nop
01: aconst_null
b1: return

If you run javap -c test_1.class, however, you will see:

0: aload_0
1: invokespecial #1   // Method java/lang/Object."<init>":()V
4: return

The number before the colon is the offset, not the bytecode. As you can see, offsets 2 and 3 are missing, because the invokespecial instruction uses 2 bytes for parameters, which is documented:

Format

invokespecial
indexbyte1
indexbyte2

Description

The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (§2.6), where the value of the index is (indexbyte1 << 8) | indexbyte2.

With the 2 bytes being 00 and 01, index is 1, so the bytecode instruction is as javap showed: invokespecial #1
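The index computation quoted above can be sketched in Java as follows. The class and method names are hypothetical, chosen only for illustration. The `& 0xFF` masks are needed because Java's `byte` type is signed, while the specification treats both operand bytes as unsigned.

```java
// Hypothetical helper for illustration; not part of any JDK API.
final class OperandIndex {

    // Combines the two operand bytes into an unsigned 16-bit constant pool index,
    // i.e. (indexbyte1 << 8) | indexbyte2 with both bytes treated as unsigned.
    static int constantPoolIndex(byte indexbyte1, byte indexbyte2) {
        return ((indexbyte1 & 0xFF) << 8) | (indexbyte2 & 0xFF);
    }

    public static void main(String[] args) {
        // The operand bytes 00 01 from the constructor above.
        System.out.println("invokespecial #" + constantPoolIndex((byte) 0x00, (byte) 0x01));
        // prints "invokespecial #1"
    }
}
```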

If you then look at the constant pool output, you'll see that constant #1 is a methodref to the Object no-arg constructor.

Your specific question is about the bytes a7 ff e2, which are not 3 instructions but the single 3-byte goto instruction:

Format

goto
branchbyte1
branchbyte2

Description

The unsigned bytes branchbyte1 and branchbyte2 are used to construct a signed 16-bit branchoffset, where branchoffset is (branchbyte1 << 8) | branchbyte2.

That means ff e2 gives branchoffset = 0xffe2, which is -30 when read as a signed 16-bit value, so instead of

a7: goto
ff: impdep2
e2: (null)

Your program should have printed something like:

a7 ff e2: goto -30
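As a minimal sketch of how the signed offset could be decoded in Java (the class and method names are hypothetical): the two bytes are combined as unsigned values, and the cast to `short` then reinterprets the 16-bit result as signed.

```java
// Hypothetical helper for illustration; not part of any JDK API.
final class BranchOffset {

    // Builds the signed 16-bit branch offset from the two operand bytes of goto.
    static int branchOffset(byte branchbyte1, byte branchbyte2) {
        int unsigned16 = ((branchbyte1 & 0xFF) << 8) | (branchbyte2 & 0xFF);
        // Casting to short sign-extends, so 0xffe2 becomes -30.
        return (short) unsigned16;
    }

    public static void main(String[] args) {
        System.out.println("a7 ff e2: goto " + branchOffset((byte) 0xff, (byte) 0xe2));
        // prints "a7 ff e2: goto -30"
    }
}
```

Note that javap prints the branch target (the offset of the goto plus the branch offset) rather than the relative value.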

Upvotes: 1

Aplet123

Reputation: 35512

You fail to account for multi-byte instructions. From the reference for goto, the format is:

goto
branchbyte1
branchbyte2

The unsigned bytes branchbyte1 and branchbyte2 are used to construct a signed 16-bit branchoffset, where branchoffset is (branchbyte1 << 8) | branchbyte2. Execution proceeds at that offset from the address of the opcode of this goto instruction. The target address must be that of an opcode of an instruction within the method that contains this goto instruction.

goto is 0xa7, and it should be followed by 2 bytes that encode the branch offset, making the instruction 3 bytes wide. Your code ignores this, disassembling 1 byte, then treating the next 2 bytes as valid instructions, which they aren't.
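A rough sketch of a decoding loop that skips operand bytes might look like the following. The class and its methods are hypothetical, and the opcode table covers only the four opcodes discussed here (aload_0, invokespecial, goto, return). A real disassembler needs the operand length of every opcode, plus special handling for the variable-length tableswitch, lookupswitch and wide instructions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch for illustration; handles only four opcodes.
final class MiniDisassembler {

    // Number of operand bytes that follow the opcode, or -1 if unknown to this sketch.
    static int operandLength(int opcode) {
        switch (opcode) {
            case 0x2a: return 0; // aload_0
            case 0xb1: return 0; // return
            case 0xa7: return 2; // goto
            case 0xb7: return 2; // invokespecial
            default:   return -1;
        }
    }

    static String mnemonic(int opcode) {
        switch (opcode) {
            case 0x2a: return "aload_0";
            case 0xb1: return "return";
            case 0xa7: return "goto";
            case 0xb7: return "invokespecial";
            default:   return "unknown";
        }
    }

    // Decodes a method's code array, advancing past each instruction's operands.
    static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;
            int len = operandLength(opcode);
            if (len < 0 || pc + len >= code.length) {
                throw new IllegalArgumentException("unsupported or truncated instruction at " + pc);
            }
            StringBuilder line = new StringBuilder(pc + ": " + mnemonic(opcode));
            if (len == 2) {
                int operand = ((code[pc + 1] & 0xFF) << 8) | (code[pc + 2] & 0xFF);
                if (opcode == 0xa7) {
                    line.append(' ').append((int) (short) operand); // signed branch offset
                } else {
                    line.append(" #").append(operand);              // constant pool index
                }
            }
            lines.add(line.toString());
            pc += 1 + len; // skip the opcode and its operand bytes
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] constructor = {0x2a, (byte) 0xb7, 0x00, 0x01, (byte) 0xb1};
        disassemble(constructor).forEach(System.out::println);
    }
}
```

For the constructor bytes 2a b7 00 01 b1, this yields the offsets 0, 1 and 4, matching the javap output rather than treating 00 and 01 as instructions.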

Upvotes: 2
