Peterdk

Reputation: 16015

How to programmatically decompile values into a source literal?

I'm busy with a simple decompiler for Android. I want to make a nice decompiled view. I use dex2jar.

Let's say I have a field declaration: public byte[] test = new byte[2]; With the FieldVisitor I get:

 public as modifier
 byte[] as type
 test as name
 and an Object as value <--

If I have an object such as that byte[2] value, is it possible to get the source text new byte[2] back?

Upvotes: 1

Views: 250

Answers (1)

millimoose

Reputation: 39960

The code:

public class Foo {
    public byte[] bar = new byte[3];
}

compiles to the same as:

class Foo2 {
    public byte[] bar;

    public Foo2() {
        this.bar = new byte[3];
    }
}

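There is no "literal" here. Field initialisers and initialiser blocks are copied, in source order, into the code of every constructor that calls a superclass constructor, right after that call. The information you're looking for isn't preserved as a value attached to the field. You'd have to look at the decompiled code of those constructors and analyse it somehow, and the result would be ambiguous: nothing in the bytecode distinguishes a field initialiser from an assignment written at the start of the constructor body.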
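To make the folding of initialisers into constructors visible, here is a minimal sketch that logs when each piece runs. The names InitOrderDemo, traced and Base are made up for illustration; only the bar field comes from the example above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: shows that the initialiser of `bar` runs inside every
// constructor, after the superclass constructor and before the constructor body.
final class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    // Hypothetical helper: records a label, then builds the array.
    static byte[] traced(String label, int length) {
        LOG.add(label);
        return new byte[length];
    }

    static class Base {
        Base() { LOG.add("Base constructor"); }
    }

    static class Foo extends Base {
        public byte[] bar = traced("field initialiser", 3);

        Foo() { LOG.add("Foo() body"); }
        Foo(int unused) { LOG.add("Foo(int) body"); }
    }

    // Constructs Foo through both constructors and returns the recorded order.
    static List<String> run() {
        LOG.clear();
        new Foo();
        new Foo(0);
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The initialiser is logged once per constructed object, whichever constructor is used, which is why a decompiler sees it only as ordinary constructor code.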

The opcodes for this constructor are:

0:  aload_0
1:  invokespecial   #1; //Method java/lang/Object."<init>":()V
4:  aload_0
5:  iconst_3
6:  newarray byte
8:  putfield    #2; //Field bar:[B
11: return

The instructions at offsets 4 through 8 correspond to the line this.bar = new byte[3];. They mean roughly:

  1. Push the this reference onto the stack.
  2. Push the integer 3 onto the stack.
  3. Pop an integer (the 3) off the top of the stack, create a byte array of that length, push the array onto the stack.
  4. Set the value that's on the top of the stack (the byte array) as the value of the field named by constant pool entry #2 (that's bar) of the object that's second-from-the-top on the stack (this). (Also, pop the two off the stack.)

This doesn't map to the original Java source very well. As you can see, the part that corresponds to "new byte[3]" is inserted in the middle of the part that implements "this.bar = …", and things happen out of order even for an expression as simple as this. Reconstructing statements from bytecode probably isn't going to be trivial: statements aren't delimited explicitly, and roughly speaking a statement ends when everything has been popped off the stack.
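As a rough sketch of what that reconstruction could look like, the code below replays the four instructions on a stack of source-text fragments instead of real values, and emits a statement when putfield consumes them. StackRebuilder and its textual instruction format are hypothetical, not part of dex2jar or any bytecode library, and only the opcodes from this one example are handled.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical helper: a toy symbolic interpreter for a handful of opcodes.
final class StackRebuilder {

    // Replays instructions, keeping source fragments on the stack, and
    // returns the statements produced along the way.
    static String rebuild(List<String> instructions) {
        Deque<String> stack = new ArrayDeque<>();
        StringBuilder out = new StringBuilder();
        for (String insn : instructions) {
            String[] parts = insn.trim().split("\\s+");
            switch (parts[0]) {
                case "aload_0":            // local 0 of an instance method is `this`
                    stack.push("this");
                    break;
                case "iconst_3":           // push the int constant 3
                    stack.push("3");
                    break;
                case "newarray": {         // pop the length, push the array expression
                    String length = stack.pop();
                    stack.push("new " + parts[1] + "[" + length + "]");
                    break;
                }
                case "putfield": {         // pop value, then target; emit an assignment
                    String value = stack.pop();
                    String target = stack.pop();
                    out.append(target).append('.').append(parts[1])
                       .append(" = ").append(value).append(';');
                    break;
                }
                default:
                    throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        if (!stack.isEmpty()) {
            throw new IllegalStateException("leftover operands: " + stack);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList("aload_0", "iconst_3", "newarray byte", "putfield bar");
        System.out.println(rebuild(code)); // prints "this.bar = new byte[3];"
    }
}
```

A real decompiler also has to deal with branches, exceptions and values that stay on the stack across statements, so this only illustrates the basic idea.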

Upvotes: 1
