Reputation: 63
I need to find a bug in a 10,000+ line assembly program and then maintain it in the future. It is assembler code for a Microchip PIC18 microcontroller. The code was written by a former colleague who is no longer available. He left no documentation, and the code is very poorly commented.
The code also uses a number of direct gotos and bras, including relative ones ($+0x06 for instance), so I'm very afraid to edit it.
I would rather decompile his assembly source code into 'new' C-code (or C++) and work from there. I would like to know:
Upvotes: 2
Views: 3802
Reputation: 1649
While decompiling is possible to some extent, whether it applies in your situation depends on the software available. Personally, I would start by deciphering the code manually and roughly reconstructing it in C. Of course, it's a tedious task and will probably introduce errors and bugs along the way, but if you are supposed to maintain the codebase, it makes sense to make that task as easy as possible for later. Future maintainers may appreciate the work too.
Upvotes: 0
Reputation: 71566
The non-MIPS PICs are extremely compiler-unfriendly; using C is slow and costly because it consumes limited resources.
Gotos and branches are how processors work; there is nothing wrong with them.
I would start by removing the relative branches (the $+0x06 kind) and replacing them with labels. Checking that you have added no new bugs should be trivial: assemble the code before and after the change and compare the two binaries, which should be identical. Once all the relative branches use labels, you can add or remove code with less fear of breaking something else.
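The before/after comparison can be done with any binary diff tool. As a minimal sketch of the idea in C, assuming you have the assembler output from both versions open as streams (the function name `first_mismatch` is made up for illustration):

```c
#include <stdio.h>

/* Returns the offset of the first differing byte, or -1 if both streams
 * hold identical content. A difference in length counts as a mismatch. */
static long first_mismatch(FILE *before, FILE *after)
{
    long offset = 0;

    for (;;) {
        int x = fgetc(before);
        int y = fgetc(after);

        if (x != y)
            return offset;   /* bytes differ, or one stream ended early */
        if (x == EOF)
            return -1;       /* both ended together: identical */
        offset++;
    }
}
```

If the result is anything other than -1 after a label-only change, the edit altered the generated code and should be reviewed before going further.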
Decompilers don't work and can't work. It would be like trying to take the french fries you had for lunch out of your intestinal tract and restore the original potato (without the skin of course, unless they were skin-on fries). The original source has been cooked and processed a number of times before it becomes the asm that is fed to the assembler, then it is cooked (well...warmed up) another time if you use objects and a linker. Too much material is lost along the way that cannot be recovered.
It is very possible to do a static binary translation and get something that is in the C language but is significantly harder to read and maintain than the asm, especially for architectures with flags. The translator is not going to be perfect at removing all of the dead code. Every add instruction becomes the add itself plus code to maintain the flags: detect unsigned overflow and set the carry flag if so, else clear it; if the result equals zero, set the zero flag, else clear it; if the msbit is set, set the n bit, else clear it; if there is a v bit, set it on signed overflow, else clear it. One simple line of asm becomes 10 lines of C code unless the translator can automatically remove that dead code.
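To make that blow-up concrete, here is a rough sketch of what a translator might emit for a single 8-bit add on a PIC18-style core. The `cpu_t` struct, the `translated_add` name and the flag constants are invented for illustration and are not the output of any real tool; the bit positions follow the PIC18 STATUS register layout (C, DC, Z, OV, N).

```c
#include <stdint.h>

/* Hypothetical model of the state a translator has to carry around. */
typedef struct {
    uint8_t w;       /* working register */
    uint8_t status;  /* flag bits */
} cpu_t;

enum {
    FLAG_C  = 1u << 0,  /* carry out of bit 7 */
    FLAG_DC = 1u << 1,  /* digit carry out of bit 3 */
    FLAG_Z  = 1u << 2,  /* result is zero */
    FLAG_OV = 1u << 3,  /* signed overflow */
    FLAG_N  = 1u << 4   /* result is negative (bit 7 set) */
};

/* One "add f to W" instruction, with every flag update spelled out. */
static void translated_add(cpu_t *cpu, uint8_t f)
{
    uint8_t  a    = cpu->w;
    uint16_t wide = (uint16_t)(a + f);
    uint8_t  r    = (uint8_t)wide;

    cpu->status &= (uint8_t)~(FLAG_C | FLAG_DC | FLAG_Z | FLAG_OV | FLAG_N);
    if (wide > 0xFF)                      cpu->status |= FLAG_C;
    if ((a & 0x0F) + (f & 0x0F) > 0x0F)   cpu->status |= FLAG_DC;
    if (r == 0)                           cpu->status |= FLAG_Z;
    if (~(a ^ f) & (a ^ r) & 0x80)        cpu->status |= FLAG_OV;
    if (r & 0x80)                         cpu->status |= FLAG_N;

    cpu->w = r;
}
```

Most of the time only one of those flags, or none, is ever read by the following instructions, so nearly all of this is dead code that the translator has to prove is dead before it can drop it.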
Upvotes: 2
Reputation: 399989
Here's how I see it:
For more information about the problem space, check out the Boomerang project's FAQ.
Upvotes: 4