Reputation: 573
#include <stdio.h>
int main() {
    int i;
    for (i = 0; i < 10000; i++) {
        printf("%d", i);
    }
}
I want gcc to unroll the loop in this code, but it does not happen even when I pass the unrolling flag:
gcc -O2 -funroll-all-loops --save-temps unroll.c
The assembly I get still contains a loop of 10000 iterations:
_main:
Leh_func_begin1:
pushq %rbp
Ltmp0:
movq %rsp, %rbp
Ltmp1:
pushq %r14
pushq %rbx
Ltmp2:
xorl %ebx, %ebx
leaq L_.str(%rip), %r14
.align 4, 0x90
LBB1_1:
xorb %al, %al
movq %r14, %rdi
movl %ebx, %esi
callq _printf
incl %ebx
cmpl $10000, %ebx
jne LBB1_1
popq %rbx
popq %r14
popq %rbp
ret
Leh_func_end1:
Can someone please tell me how to get loop unrolling to work correctly in gcc?
Upvotes: 7
Views: 5176
Reputation: 70382
Loop unrolling won't give you any benefit for this code, because the overhead of the function call to printf() itself dominates the work done at each iteration. The compiler may be aware of this: since it is being asked to optimize the code, it may decide that unrolling would increase the code size for no appreciable run-time gain, and that the risk of incurring an instruction cache miss is too high to justify the unrolling.
The type of unrolling required to speed up this loop would have to reduce the number of calls to printf() itself. I am unaware of any optimizing compiler that is capable of doing that.
As an example of unrolling the loop to reduce the number of printf() calls, consider this code:
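void print_loop_unrolled (int n) {
    /* Assumes n >= 0. Start one stride back so the first (i += 8) lands on the right value. */
    int i = -8;
    if (n % 8) {
        /* Print the first n % 8 numbers (all single digits) in one call. */
        printf("%.*s", n % 8, "01234567");
        i += n % 8;
    }
    /* Print the remaining numbers eight per printf() call. */
    while ((i += 8) < n) {
        printf("%d%d%d%d%d%d%d%d",i,i+1,i+2,i+3,i+4,i+5,i+6,i+7);
    }
}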
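To check that this kind of manual unrolling really produces the same output with fewer formatting calls, a rough sketch along these lines may help. The helpers append, format_plain and format_unrolled are hypothetical names, not part of the code above; they write into a caller-supplied buffer instead of stdout so the two results can be compared, and they assume n >= 0 and a buffer large enough for the whole output.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: append formatted text to buf, tracking bytes used.
   Returns 0 on success, -1 if the text does not fit. */
int append(char *buf, size_t size, size_t *used, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int w = vsnprintf(buf + *used, size - *used, fmt, ap);
    va_end(ap);
    if (w < 0 || (size_t)w >= size - *used) return -1;
    *used += (size_t)w;
    return 0;
}

/* One formatting call per number. Returns the number of calls, or -1 on error. */
int format_plain(char *buf, size_t size, int n) {
    size_t used = 0;
    int calls = 0;
    if (size == 0 || n < 0) return -1;
    buf[0] = '\0';
    for (int i = 0; i < n; i++) {
        if (append(buf, size, &used, "%d", i) != 0) return -1;
        calls++;
    }
    return calls;
}

/* Same output, but eight numbers per formatting call (same scheme as above).
   Returns the number of calls, or -1 on error. */
int format_unrolled(char *buf, size_t size, int n) {
    size_t used = 0;
    int calls = 0;
    int i = -8;
    if (size == 0 || n < 0) return -1;
    buf[0] = '\0';
    if (n % 8) {
        /* Leading n % 8 numbers are single digits, so a string prefix suffices. */
        if (append(buf, size, &used, "%.*s", n % 8, "01234567") != 0) return -1;
        calls++;
        i += n % 8;
    }
    while ((i += 8) < n) {
        if (append(buf, size, &used, "%d%d%d%d%d%d%d%d",
                   i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7) != 0)
            return -1;
        calls++;
    }
    return calls;
}
```

For n = 10000 the plain version makes 10000 formatting calls and the unrolled one makes 1250, and both buffers hold the same text. A buffer of 40000 bytes is enough for that case.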
Upvotes: 7
Reputation: 20818
Replace
printf("%d",i);
with
volatile int j = i;
and see if the loop gets unrolled.
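Spelled out, the modified loop might look roughly like this. The function name run_volatile_loop is made up for illustration; the body is the question's loop with the printf() call swapped for the volatile store. Whether gcc actually unrolls it depends on the compiler version and flags, so treat this as something to compile with -S and inspect rather than a guaranteed result.

```c
/* Hypothetical wrapper around the question's loop, with printf() replaced
   by a volatile store so there is no function call in the loop body. */
int run_volatile_loop(void) {
    int i;
    for (i = 0; i < 10000; i++) {
        /* The volatile store cannot be optimized away, so the loop survives. */
        volatile int j = i;
        (void)j; /* silences unused-variable warnings (this is also a volatile read) */
    }
    return i; /* number of iterations executed */
}
```

Compiling this with the same flags as in the question and comparing the assembly should show whether the printf() call was what kept the loop from being unrolled.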
Upvotes: 1
Reputation: 145829
gcc has parameters that cap how far loops are unrolled.
You have to use -O3 -funroll-loops and play with the parameters max-unroll-times, max-unrolled-insns and max-average-unrolled-insns.
Example:
-O3 -funroll-loops --param max-unroll-times=200
Upvotes: 6