Vadim

Reputation: 1241

Does it make sense to optimize data hazards if a processor supports out-of-order execution?

Can programmers still optimize for data hazards on processors that support full out-of-order execution?

Upvotes: 2

Views: 524

Answers (1)

Hadi Brais

Reputation: 23669

A processor that is capable of executing instructions out of order is not necessarily capable of eliminating any data hazards. An out-of-order execution implementation may not include register renaming, in which case WAW (write-after-write) and WAR (write-after-read) hazards will still cause pipeline stalls.

However, most modern OoOE processors implement register renaming, which eliminates WAW and WAR hazards but not RAW (read-after-write) hazards. If a floating-point division instruction is followed by a sequence of instructions that require the result of the division, the pipeline might stall for a long time. Another example is a branch instruction that depends on a load instruction that causes a page fault. Irrespective of whether the prediction for that branch was correct, the reorder buffer might become full or almost full, possibly causing a stall. If the branch was mispredicted, the penalty would be much higher. So you usually don't have to worry about register WAW and WAR hazards, but RAW hazards are important.
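
To make the three hazard types concrete, here is a small C sketch. The variables r1, r2 and r3 stand in for architectural registers, which is purely an illustrative assumption: a real compiler is free to allocate registers differently, and the function name hazard_demo is made up.

```c
/* Illustrative only: r1..r3 play the role of architectural registers. */
int hazard_demo(int a, int b)
{
    int r1, r2, r3;

    r1 = a / b;    /* I1: long-latency divide writes r1                  */
    r2 = r1 + 1;   /* I2: RAW on r1, so it must wait for I1's result     */
    r1 = a + b;    /* I3: WAW with I1 and WAR with I2 on r1; with
                      register renaming I3 gets a fresh physical
                      register and does not have to wait for I1 or I2    */
    r3 = r1 * 2;   /* I4: RAW on the r1 written by I3                    */

    return r2 + r3;
}
```

With renaming, I3 and I4 can run while the divide in I1 is still in flight; only I2 (and the final addition) truly has to wait for it.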

There are a few techniques that can be used to reduce the impact of RAW hazards:

  • SIMD instructions can be used to fully overlap the latency of RAW dependencies of multiple data elements.
  • Loops that contain mutually independent dependency chains can be fused so that the chains execute in parallel on a superscalar CPU. This increases the utilization of the available execution units.
  • Instructions with lower latency can be used. For example, a multiplication by a power of 2 can be replaced with a left shift.
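
As a rough sketch of the last two techniques in C (all function names here are hypothetical, and an optimizing compiler may well do both transformations on its own): the first pair of functions fuses two loops whose RAW chains (a running sum and a running product) are independent of each other, and the second pair replaces a multiplication by 8 with a left shift by 3.

```c
#include <stddef.h>
#include <stdint.h>

/* Two loops, each carrying one serial RAW chain (sum, then prod). */
void reduce_separate(const double *a, const double *b, size_t n,
                     double *sum_out, double *prod_out)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];            /* each add waits for the previous add */

    double prod = 1.0;
    for (size_t i = 0; i < n; i++)
        prod *= b[i];           /* each mul waits for the previous mul */

    *sum_out = sum;
    *prod_out = prod;
}

/* Fused loop: the add chain and the mul chain are independent, so
   their latencies can overlap within each iteration. */
void reduce_fused(const double *a, const double *b, size_t n,
                  double *sum_out, double *prod_out)
{
    double sum = 0.0, prod = 1.0;
    for (size_t i = 0; i < n; i++) {
        sum += a[i];
        prod *= b[i];
    }
    *sum_out = sum;
    *prod_out = prod;
}

/* Lower-latency instruction: for unsigned values, x * 8 == x << 3. */
uint32_t times8_mul(uint32_t x)   { return x * 8u; }
uint32_t times8_shift(uint32_t x) { return x << 3; }
```

The order of operations within each chain is unchanged by the fusion, so the fused version returns exactly the same floating-point results as the separate loops.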

Typically, an optimizing compiler is capable of performing these optimizations, although it may still produce sub-optimal machine code.

Memory dependencies are also important. In particular, memory RAW dependencies will incur a penalty if the store result cannot be forwarded to the load due to some structural limitation. Memory WAW and WAR hazards have no penalties because most processors retire instructions in program order. That said, in an architecture with a strong memory ordering model such as x86, typically, all stores must be performed in program order, irrespective of WAW dependencies.
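
One pattern that, as far as I know, commonly defeats store-to-load forwarding on many x86 cores is a wide load that spans several narrower stores that were just issued. The C sketch below shows that pattern next to a register-only alternative. The function names are hypothetical, the exact forwarding rules differ between microarchitectures (check the optimization guide for your CPU), and an optimizing compiler may remove the memory round trip entirely.

```c
#include <stdint.h>
#include <string.h>

/* Four 1-byte stores followed by one 4-byte load that covers all of
   them. If this reaches the hardware as written, the load may not be
   forwardable from any single store and may have to wait until the
   stores have been written to the cache. */
uint32_t pack_via_memory(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    uint8_t buf[4];
    uint32_t v;

    buf[0] = b0;
    buf[1] = b1;
    buf[2] = b2;
    buf[3] = b3;
    memcpy(&v, buf, sizeof v);   /* wide load over the narrow stores */
    return v;
}

/* Same packing done with shifts and ORs, so there is no memory RAW
   dependency at all. The value matches pack_via_memory only on a
   little-endian machine. */
uint32_t pack_in_registers(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    return (uint32_t)b0
         | ((uint32_t)b1 << 8)
         | ((uint32_t)b2 << 16)
         | ((uint32_t)b3 << 24);
}
```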

There are many other possible performance issues. You can refer to the optimization guide of the processor and/or architecture you're developing for.

Upvotes: 2
