Reputation: 7672
I am writing instrumentation on Dalvik bytecode which performs some logging for various method call entries. Specifically, at various method call sites, I insert a set of instructions which collects the parameters, puts them in an Object[] array, and then passes that to a logging function.
This is all well and good: I have implemented it and gotten past all of the kludges for most apps. But I'm encountering one particularly impenetrable Dalvik verifier error:
java.lang.VerifyError: Verifier rejected class io.a.a.g: void io.a.a.g.r()
failed to verify: void io.a.a.g.r(): [0x570] register v5 has type Reference:
java.lang.Object but expected Precise Reference: java.lang.String
I looked at the code that is being generated by my instrumentation, and all I'm doing is putting register v5 in an array of objects.
I have a few questions here:
[0x570] points into the middle of a bytecode instruction, so it doesn't clearly map to any instruction: the instructions around there don't involve v5.
EDIT:
Here's a dump of the bytecode of the method I'm speaking about: https://gist.github.com/kmicinski/c8382f0521b19643bb24379d91c47d36 As you can see, 0x570 isn't the beginning of an instruction, and (as far as I can tell) there isn't any place where v5 conflicts with a String where it should be an Object.
Upvotes: 8
Views: 5570
Reputation: 7672
I'd like to add my answer, since others have been so helpful in donating their time to answer a tricky question that probably doesn't generalize much!
As @Antimony points out, there was a control path in my code that began at an exception handler, stored the exception in v5 (causing v5 to be an Object) and then goto'd a point within the range of another exception handler. That exception handler then led to code that used v5 as a String, causing the verifier error.
In the app's original code, the only thing at the target of that goto was a return-void instruction. Because of this, the Dalvik verifier did not propagate the path through to the exception handler.
Unfortunately, when I rewrote this app, the target of that goto came to contain more than just this return-void instruction, making the verifier reason through that block and into the caught exception handler. In particular, before the return-void, I inserted a call to Logger.logMethodExit, which the verifier then assumes could transfer control to an exception handler (:BF0 in this case) and eventually to the place where v5 was used as a String. In the original app, v5 was killed (in the gen/kill dataflow sense) at that point. But upon rewriting, I included this extra invocation, breaking the dataflow invariants... Crud.
I think I know how to fix this in my implementation, but it was sure a pain to figure out!
More general lessons learned here:
Verifier error offsets are given in 16-bit code units, so the byte offset within the bytecode is 2 * the reported offset.
Unlike JVM bytecode, Dalvik bytecode considers a subset of opcodes non-throwable, including return. This will affect dataflow analysis.
The "Precise Reference" error means a register is constrained to be a particular refinement of Object at its use, but is only known to be an Object along some other path reaching that use (though this error seems a little esoteric to me..)
When you rewrite bytecode you need to be cognizant of the gen/kill sets you're implicitly working around. In particular, the return-* instructions will immediately kill things, whereas jumping to the beginning of a basic block within a try range will keep those things live.
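The effect of inserting a throwable instruction in front of a return-void can be sketched with a toy model. This is only an illustration under stated assumptions, not the verifier's actual algorithm: the NON_THROWING set is a hand-picked guess based on the lesson above, the helper names are hypothetical, and the block contents at :162 are reconstructed from the description in this thread (the label values are treated as byte offsets).

```python
# Toy model: which exception handlers does a basic block flow into?
# Assumption: only opcodes that can throw create an edge to a covering handler.
NON_THROWING = {"goto", "goto/16", "const", "move-exception", "nop"}  # illustrative, not exhaustive


def can_throw(opcode):
    """Treat return-* and a few simple opcodes as unable to throw (assumption)."""
    return not (opcode.startswith("return") or opcode in NON_THROWING)


def handler_successors(block, try_ranges):
    """Return the handler labels reachable from a block.

    block: list of (byte_offset, opcode) pairs
    try_ranges: list of (start, end, handler_label), end exclusive
    """
    targets = set()
    for offset, opcode in block:
        if not can_throw(opcode):
            continue
        for start, end, handler in try_ranges:
            if start <= offset < end:
                targets.add(handler)
    return targets


# .catch ClassNotFoundException {:2E .. :594} :BF0
TRY_RANGES = [(0x2E, 0x594, ":BF0")]

# Original app: the goto target holds only a return-void.
original_block = [(0x162, "return-void")]

# Instrumented app: a logging call (6 bytes long) now precedes the return.
instrumented_block = [(0x162, "invoke-static"), (0x168, "return-void")]

print(handler_successors(original_block, TRY_RANGES))
print(handler_successors(instrumented_block, TRY_RANGES))
```

In this model the original block has no edge to any handler, so v5's Object type dies at the return. The instrumented block gains an edge to :BF0, which is how the Object type can reach the later use of v5 as a String.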
Upvotes: 3
Reputation: 39451
If you look closely at the error, it is telling you that you're passing an Object where a String is expected. Anyway, there isn't much more that can be said unless you post the actual bytecode that is causing the problem.
Are you sure that 0x570 points to the middle of an instruction? It shouldn't. Anyway, the way you would go about debugging it is to look at the relevant instruction and figure out why v5 is an Object when it's supposed to be a String. Or you could post the bytecode so I could take a look.
Edit: Now that you've posted the code, there is in fact a path which results in v5 being Object, but it is a bit subtle.
The exception handler .catch JSONException {:5D8 .. :938} :BDE jumps to :BDE.
The code for the exception handler stores the caught exception in v5, meaning that v5 is no longer a String at this point. It then jumps to :162.
:BDE
00000BDE move-exception v5
00000BE0 const v0, 0x00488B36
00000BE6 invoke-static Logger->logBasicBlockEntry(I)V, v0
00000BEC goto/16 :162
:162 is within the range of another exception handler: .catch ClassNotFoundException {:2E .. :594} :BF0
:BF0 leaves v5 untouched and jumps to :A28.
:BF0
00000BF0 move-exception v6
00000BF2 const v0, 0x00488B3E
00000BF8 invoke-static Logger->logBasicBlockEntry(I)V, v0
00000BFE goto/16 :A28
:A28 is the beginning of a code block which assumes that v5 is String. In particular, on instruction :AE0, v5 is passed to a function taking a String.
00000AE0 invoke-virtual StringBuilder->append(String)StringBuilder, v7, v5
0xAE0 is exactly twice 0x570, which explains the offset shown in the error, once you adjust for code units as JesusFreke suggested.
Note that this isn't necessarily the only broken code path, it's just the first one I found while looking through your code. However, one bad path is sufficient to unify v5's type with JSONException and hence turn it into Object.
Upvotes: 4
Reputation: 20262
0x570 is likely the offset in code units, which are two bytes each. So the byte offset is actually 0xAE0, which does correspond with an instruction, and that instruction does reference v5.
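The conversion is simple enough to check by hand, but a minimal sketch makes it explicit. The function names here are made up for illustration; the only fact relied on is that a code unit is two bytes.

```python
CODE_UNIT_BYTES = 2  # a Dalvik code unit is 16 bits wide


def code_unit_to_byte_offset(unit_offset):
    """Convert a verifier-style offset (in code units) to a byte offset."""
    return unit_offset * CODE_UNIT_BYTES


def byte_to_code_unit_offset(byte_offset):
    """Convert a byte offset from a disassembly listing back to code units."""
    if byte_offset % CODE_UNIT_BYTES != 0:
        raise ValueError("byte offset is not aligned to a code unit")
    return byte_offset // CODE_UNIT_BYTES


print(hex(code_unit_to_byte_offset(0x570)))  # 0xae0
```

So an offset that looks like it falls mid-instruction in a byte-addressed dump may simply be expressed in the other unit.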
I expect what's happening is that there is code somewhere that stores a string in v5, but there's another code path that merges in between where the string is stored in v5 and where it's used, and that code path has a different object type stored in v5. When the code paths merge, it uses the common superclass of the two types as the type of the register. So if the two types are completely unrelated, java.lang.Object will be the superclass.
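The merge rule described above can be sketched as a walk up the superclass chain. This is a simplified model, not the verifier's implementation: it ignores interfaces, arrays and primitives, and the SUPERCLASS table is a hand-written excerpt of the class hierarchy rather than anything read from a dex file.

```python
# Hand-written excerpt of the class hierarchy (assumption for illustration).
SUPERCLASS = {
    "java.lang.String": "java.lang.Object",
    "org.json.JSONException": "java.lang.Exception",
    "java.lang.ClassNotFoundException": "java.lang.ReflectiveOperationException",
    "java.lang.ReflectiveOperationException": "java.lang.Exception",
    "java.lang.Exception": "java.lang.Throwable",
    "java.lang.Throwable": "java.lang.Object",
    "java.lang.Object": None,
}


def ancestors(cls):
    """Return cls followed by each of its superclasses, ending at Object."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = SUPERCLASS[cls]
    return chain


def merge_types(a, b):
    """Return the nearest common superclass of two reference types."""
    seen = set(ancestors(a))
    for candidate in ancestors(b):
        if candidate in seen:
            return candidate
    raise ValueError("types share no common superclass in the table")


# A String on one path and a caught JSONException on another merge to Object.
print(merge_types("java.lang.String", "org.json.JSONException"))
```

Two related types merge to something more specific (two exception classes merge to java.lang.Exception in this table), which is why the loss of precision only shows up when an unrelated type reaches the same register.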
What you can do to debug this issue is run baksmali using the --register-info ARGS,DEST,FULLMERGE option (and also --code-offsets, so you can find 0xAE0 easily), and then look backwards from 0xAE0 and see where the type of v5 is set to be an Object.
Upvotes: 3