Reputation: 61
I used the Apple built-in "otool" command with "-Vvtd" switches to dump a Mach-O i386 binary, redirected to a .s file. I have tried unsuccessfully to use nasm and GAS assemblers to recompile the code on a PPC machine ("as"-binary in the i386 directory of gcc/darwin and "as"-binary in the ppc directory as well). The output reads something like:
some_topmost_label:
(__TEXT,__text) section
_default_pager:
00112000 pushl %ebp
00112001 movl %esp,%ebp
00112003 pushl %edi
00112004 pushl %esi
00112005 pushl %ebx
00112006 subl $0x3c,%esp
00112009 movl _default_pager_internal_count,%ebx
0011200f addl _default_pager_external_count,%ebx
00112015 leal 0x00000004(,%ebx,4),%ebx
There is a data section as well, going like:
...
(__DATA,__data) section
00421000 02 00 00 00 04 00 00 00 00 40 00 00 28 64 65 66
...
00449bc0 50 00 3d 00 00 00 00 00 00 00 00 00 00 00 00 00
00449bd0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
...
I am intent on running the binary on a PPC Mac, hence the recompiling effort; I have tried removing the addresses in the leftmost column to make the syntax more "AT&T"-style, leaving them in, etc. I DO NOT want to make any edits to the existing code structure (this is not exactly a reverse-engineering effort, per se, just some customization). However, if I have to do any editing, I would very much like it to be strictly for making the existing, unadulterated i386 code run as is on PPC.
I will very much appreciate your help.
Regards
Upvotes: 0
Views: 330
Reputation: 61
Decompilers can produce C files (as I have tried) which can be used to compile from source on a different architecture (which I have also tried). The experience was dicey at best. I'm still working on it and will likely still be for some time.
As an alternative, emulation can be implemented to run a binary/executable for i386 on ppc. This is a quick, but potentially less effective, route.
In addition, this confirmed for me that assembly-to-assembly translation would be the most painful route, compared with using the C programming language as an intermediate (by decompiling the i386 binary to C and recompiling the C on the target architecture).
In the case of decompiling: what if it produces a quarter-million lines of code? You may need a team :)
Upvotes: 1
Reputation:
In assembly language, every "statement" is an instruction that the processor can execute. The instructions are represented in a human-readable text format (if you're the right kind of human) but still, every instruction name (e.g. movl) and register (e.g. %esp) and memory reference (e.g. 0x00000004(,%ebx,4)) that exists in assembly directly corresponds to an implementation detail of the processor.
So every processor really has its own assembly language. Dumping and re-assembling doesn't get you anywhere. Not even within a set of related processors - if you take some 32-bit x86 code that was compiled with SSE3 optimizations enabled and dump it, you'll have assembly code with SSE3 instructions. Re-assembling it won't get you a program that can run on a slightly older x86-32 processor.
There might be a chance, if your executable is old enough, that it's a "fat binary". During the period when PPC and x86 Macs were both supported by Apple, they would pack the compiled PPC and x86 code together in a single file. Judging by this answer you can detect fat binaries with the file command.
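If the file command isn't handy, the same check can be sketched by reading the header directly. This is a minimal sketch, not a replacement for the real tool: it assumes the classic 32-bit fat header layout (a big-endian magic 0xCAFEBABE, an architecture count, then 20-byte entries whose first field is the CPU type). The function name fat_architectures and the small CPU name table are made up for illustration. Note that Java class files start with the same magic, so a match is only a hint.

```python
import struct
import sys

FAT_MAGIC = 0xCAFEBABE          # magic number at the start of a fat (universal) file
CPU_ARCH_ABI64 = 0x01000000     # flag OR-ed into the CPU type for 64-bit variants
CPU_NAMES = {                   # illustrative subset of Mach-O CPU type codes
    7: "i386",
    7 | CPU_ARCH_ABI64: "x86_64",
    18: "ppc",
    18 | CPU_ARCH_ABI64: "ppc64",
}

def fat_architectures(data):
    """Return the architecture names listed in a fat header, or [] if none."""
    if len(data) < 8:
        return []
    # The fat header is always stored big-endian: magic, then entry count.
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        return []
    names = []
    for i in range(count):
        offset = 8 + 20 * i     # each entry: cputype, cpusubtype, offset, size, align
        if offset + 20 > len(data):
            break               # truncated input or not really a fat binary
        (cputype,) = struct.unpack_from(">I", data, offset)
        names.append(CPU_NAMES.get(cputype, "unknown(%d)" % cputype))
    return names

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        print(fat_architectures(f.read(4096)) or "not a fat binary")
```

If "ppc" shows up in the list, the PPC code is already inside the file and nothing needs to be translated.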
But chances are you have to do a lot more work than you were expecting.
PPC doesn't have a movl instruction (or any other kind of mov - it handles loads and stores separately). It doesn't have a dedicated stack register like %esp, although r1 is a safe bet, since the ABI uses it as the stack pointer by convention. It doesn't have anything like the addressing mode in 0x00000004(,%ebx,4) - that's a register being multiplied by 4 and then adding the constant 4 - in PPC you'll have to break it into separate instructions: shift (*4 = <<2) the register in one, then add the constant in another (a constant too large for an immediate field would first need loading into a register of its own). This is not a matter of whether the instructions are written in "source form" or "binary form". It's a matter of the instructions in the original code not existing at all on the PPC.
Upvotes: 4