Analysing assembly code performance

Question

I'm new to Stack Overflow and hope to get some advice on how to approach a problem I'm having. Having little assembly experience, I am finding it difficult to reason about the performance characteristics of a piece of code I have. The code is written in C and runs on a PowerPC architecture (an old Apple G5). Compiled with -O3 plus some other optimizations, the code actually runs about 30% slower than with just -O3. The difference in the generated assembly boils down to a few instructions (say 3-4) and their arrangement.

My problem is that, due to my inexperience, I have difficulty understanding why the assembly output performs worse in one case and better in the other. Tools such as oprofile are not really helpful here, and the official IBM instruction documentation does not give any insight (at least from what I have seen so far) into the performance characteristics of a particular instruction. How does one approach this kind of analysis problem? As mentioned, I have little experience with assembly and pipeline analysis, so I would appreciate any suggestions on how one usually approaches this kind of problem. Are there any tools out there that can aid me?

Also, I am not really interested in why the compiler generated the code the way it did (and in a sense I am not really interested in how the original C code works), I'm really only interested in understanding the assembly performance analysis.

Update

I just want to give a brief update on the problem. Using a PowerPC pipeline simulator from IBM, it was possible to see exactly what happened in the pipeline, and that made the problem much easier to understand (it turned out to be related to issue queues being full and to the formation of dispatch groups). I suggest that anyone looking at similar problems use a pipeline simulator; it helps a lot in understanding the performance of your program. Because of the complexity of modern high-performance processors, it seems very difficult to analyze the performance characteristics of a program without a pipeline simulator. This probably means that to truly understand how your program performs, you need to understand the architecture the code runs on.
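To give a rough feel for the "issue queue full" effect mentioned above, here is a toy model in C. It is only a sketch under made-up assumptions: it is not the IBM simulator, it does not model the G5's real dispatch groups or queue sizes, and the names (`simulate`, `SimResult`) and all parameters are hypothetical. It only shows how a small queue can stall the dispatch stage when instructions leave the queue more slowly than they arrive.

```c
#include <assert.h>

/* Result of the toy simulation (hypothetical type, for illustration only). */
typedef struct {
    int cycles;        /* total cycles until every instruction has issued */
    int stall_cycles;  /* cycles in which dispatch was limited by a full queue */
} SimResult;

/*
 * Toy model, NOT a model of any real PowerPC core:
 *  - up to dispatch_width instructions enter a single queue per cycle
 *  - the queue holds at most queue_cap instructions
 *  - one instruction leaves the queue on every cycle whose number is
 *    divisible by issue_interval (if the queue is not empty)
 */
SimResult simulate(int n_instr, int dispatch_width, int queue_cap,
                   int issue_interval)
{
    assert(n_instr >= 0 && dispatch_width > 0);
    assert(queue_cap > 0 && issue_interval > 0);

    SimResult r = {0, 0};
    int remaining = n_instr;  /* instructions not yet dispatched */
    int queued = 0;           /* instructions waiting in the queue */

    while (remaining > 0 || queued > 0) {
        /* Issue stage: drain one instruction on an issue cycle. */
        if (queued > 0 && r.cycles % issue_interval == 0)
            queued--;

        /* Dispatch stage: send as many as the width and free space allow. */
        int room = queue_cap - queued;
        int want = remaining < dispatch_width ? remaining : dispatch_width;
        int sent = want < room ? want : room;
        if (sent < want)
            r.stall_cycles++;  /* the full queue held dispatch back */

        remaining -= sent;
        queued += sent;
        r.cycles++;
    }
    return r;
}
```

With these made-up numbers, `simulate(8, 4, 4, 1)` reports 3 stalled dispatch cycles while `simulate(8, 4, 8, 1)` reports none. Both take 9 cycles in total because the issue rate is the bottleneck in this toy; on a real machine the stalled front end also delays unrelated instructions, which is the kind of effect a proper pipeline simulator makes visible.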
