Reputation: 1469
I was doing some research and came across this article, which describes the restrict keyword quite well, and I think I have a decent grasp of its valid usage as a result. To quote the article's quote of the definition of restrict:
A new feature of C99: The restrict type qualifier allows programs to be written so that translators can produce significantly faster executables. [...] Anyone for whom this is not a concern can safely ignore this feature of the language. -- From Rationale for International Standard - Programming Languages - C [std.dkuug.dk] (6.7.3.1 Formal definition of restrict)
In my case performance is a concern: I'm writing embedded code that could benefit from higher-performance compiled code, and I am not comfortable enough with assembly to use it for help. So I am considering using restrict in appropriate places. Specifically, I have several functions that loop and take pointer arguments of the same type, so on the surface it seems my code could benefit from this.
However, the article states:
You should expect code where all aliasing information is declared with the restrict keyword to almost always perform significantly better, and never worse, than with unrestricted pointers. This is especially true on superscalar RISC, or RISC-like architectures with large register files, like the PowerPC or MIPS R4000.
I am working with an ARM Cortex-M4 with the GCC toolchain. I don't understand the various processor architectures well enough to judge how "large register files" applies to my use case, but given the example processors and a quick Google search, I am pretty sure the Cortex-M4 does not belong on that list, though perhaps the application-class ARM processors would.
So with all this in mind, would I see a benefit beyond micro-optimization? I fully expect to profile it either way, but I am wondering about the qualitative effects of restrict in the context of ARM Cortex-M4/GCC, and specifically whether its pipeline can make use of the changes or whether some other factor, such as not being able to schedule memory accesses, would prevent a major benefit.
Upvotes: 3
Views: 746
Reputation: 21886
The restrict keyword allows the compiler to remove dependencies between some memory operations in a program. This opens up a large number of optimizations, e.g. tighter instruction schedules (which in turn increase the benefit of loop unrolling), autovectorization, or combining multiple scalar loads/stores into multi-register variants (ldm/stm in the case of ARM).
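To make the dependency point concrete, here is a minimal sketch of the kind of loop in question. The function names scale_add and scale_add_restrict are hypothetical, chosen only for illustration. Both versions compute the same result when the arrays do not overlap; the only difference is what the compiler is allowed to assume.

```c
#include <stddef.h>

/* Without restrict: dst and src might overlap, so the compiler has to assume
   that a store to dst[i] could change a later src[j]. That forces it to keep
   the loads and stores of successive iterations in program order. */
void scale_add(float *dst, const float *src, float k, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}

/* With restrict: the caller promises that the memory accessed through dst is
   not also accessed through src during the call. The compiler may then hoist
   several loads ahead of the stores, unroll, and merge adjacent accesses. */
void scale_add_restrict(float *restrict dst, const float *restrict src,
                        float k, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}
```

Whether GCC actually emits a different schedule for the second version depends on the optimization level and target flags, so comparing the generated assembly of the two functions is the reliable way to see the effect on a given build.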
High-end, out-of-order architectures (e.g. many Cortex-A cores in ARM's case) try very hard (and spend a lot of power) to perform these optimizations at runtime by dynamically analyzing and reordering the instruction stream (even there, restrict may enable higher-level optimizations like autovectorization). Lower-end embedded cores like the M4 lack such capabilities, so restrict annotations matter much more for performance there.
As other commenters have noted, the semantics of restrict are not exactly trivial, so I suggest using it only in hot loops.
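As a rough illustration of why the semantics need care, consider the sketch below. The function name add_first is hypothetical. The restrict qualifiers are a promise made by the caller, and the compiler does not check it.

```c
#include <stddef.h>

/* Adds src[0] to every element of dst.
   Because src is restrict-qualified and never written here, the compiler may
   load src[0] once and keep it in a register for the whole loop. */
void add_first(int *restrict dst, const int *restrict src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] += src[0];
}

/* Valid use:    int d[3] = {1, 2, 3}; const int s[1] = {10};
                 add_first(d, s, 3);      -- d and s do not overlap.
   Invalid use:  add_first(d, d, 3);      -- dst[0] is modified while the same
                 object is read through src: undefined behavior, and the
                 result may change with optimization level. */
```

Keeping restrict to a few hot loops limits how many call sites have to be audited for this kind of overlap.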
Upvotes: 0