Reputation: 731
I have a program that basically loops and does a TON of adds in each iteration.
So something like b += .01 happens probably 100 times per iteration.
So I expect the ratio of compute instructions (adds) to load and store instructions to be very high. However, unexpectedly, the more additions I do, the more memory reads and writes I get.
int b = 0;
int i;
for (i = 0; i < 100000; i++){
b += .01; /* repeated maybe 50 times */
}
I'm using the Pin tool, and the memory reads and writes go up by a lot, much faster than the additions, and I'm confused. I thought b was a local variable and as such, wasn't stored in memory but rather just the stack or in a cache. Why is this occurring?
I've looked at the assembly, and I see no usage of lw or sw anywhere.
Upvotes: 1
Views: 100
Reputation: 7330
By default (that is, without optimizations), compilers almost always put variables with automatic storage duration (e.g. int b=0;) on the stack, and the stack lives in memory.
For example, if I compile this snippet with GCC, which is close to what you wrote but a little more correct:
int main()
{
int b = 0;
int i;
for (i = 0; i < 100000; i++) {
b++;
b++;
b++;
b++;
b++;
b++;
b++;
b++;
b++;
b++;
}
return b;
}
I get the following compiled code :
00000000004004b6 <main>:
4004b6: 55 push %rbp
4004b7: 48 89 e5 mov %rsp,%rbp
4004ba: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
4004c1: c7 45 f8 00 00 00 00 movl $0x0,-0x8(%rbp)
4004c8: eb 2c jmp 4004f6 <main+0x40>
4004ca: 83 45 fc 01 addl $0x1,-0x4(%rbp)
4004ce: 83 45 fc 01 addl $0x1,-0x4(%rbp)
4004d2: 83 45 fc 01 addl $0x1,-0x4(%rbp)
4004d6: 83 45 fc 01 addl $0x1,-0x4(%rbp)
4004da: 83 45 fc 01 addl $0x1,-0x4(%rbp)
4004de: 83 45 fc 01 addl $0x1,-0x4(%rbp)
4004e2: 83 45 fc 01 addl $0x1,-0x4(%rbp)
4004e6: 83 45 fc 01 addl $0x1,-0x4(%rbp)
4004ea: 83 45 fc 01 addl $0x1,-0x4(%rbp)
4004ee: 83 45 fc 01 addl $0x1,-0x4(%rbp)
4004f2: 83 45 f8 01 addl $0x1,-0x8(%rbp)
4004f6: 81 7d f8 9f 86 01 00 cmpl $0x1869f,-0x8(%rbp)
4004fd: 7e cb jle 4004ca <main+0x14>
4004ff: 8b 45 fc mov -0x4(%rbp),%eax
400502: 5d pop %rbp
400503: c3 retq
400504: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
40050b: 00 00 00
40050e: 66 90 xchg %ax,%ax
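Note the addl $0x1,-0x4(%rbp) instructions: those increment our variable, the equivalent of b++ in the source. The operand -0x4(%rbp) is a stack slot, so each of these instructions reads the value from memory, adds 1, and writes it back, which counts as one load and one store. The loop counter i (at -0x8(%rbp)) is handled the same way. This is why you see such a high count of loads and stores, and why the count grows with every addition you put in the loop body.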
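To make the scaling concrete, here is a rough tally of the memory accesses issued by the loop in the listing above. It is only a sketch: the function names are made up for illustration, and it assumes the tool counts one read and one write for each add with a memory operand and one read for each cmpl. It ignores the few accesses in the function prologue and epilogue.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of Pin or any library.
 * mem_adds is the number of "addl $1, <stack slot>" instructions
 * that update b in one iteration of the loop body. */

/* Reads: every memory-operand add loads its operand once, the i++
 * add loads once more, and the cmpl on i loads once per loop test.
 * The test runs iterations + 1 times (the last one exits the loop). */
uint64_t loop_reads(uint64_t iterations, uint64_t mem_adds)
{
    return iterations * (mem_adds + 1) + (iterations + 1);
}

/* Writes: every memory-operand add stores its result, including i++.
 * The cmpl only reads. */
uint64_t loop_writes(uint64_t iterations, uint64_t mem_adds)
{
    return iterations * (mem_adds + 1);
}
```

Under those assumptions, loop_reads(100000, 10) is 1200001 and loop_writes(100000, 10) is 1100000 for the listing above. Going from 10 to 50 additions per iteration raises both by 4,000,000, so the memory traffic tracks the number of additions one to one.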
If you don't want your variable to go on the stack, you can enable optimizations and let the compiler's register allocator handle it, or you can pass a hint with the register keyword. Keep in mind that register is only a hint and the compiler is free to ignore it. It looks like this:
int main()
{
register int b = 0;
int i;
for (i = 0; i < 100000; i++) {
b++;
b++;
b++;
b++;
b++;
b++;
b++;
b++;
b++;
b++;
}
return b;
}
And you get the following compiled code :
00000000004004b6 <main>:
4004b6: 55 push %rbp
4004b7: 48 89 e5 mov %rsp,%rbp
4004ba: 53 push %rbx
4004bb: bb 00 00 00 00 mov $0x0,%ebx
4004c0: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp)
4004c7: eb 22 jmp 4004eb <main+0x35>
4004c9: 83 c3 01 add $0x1,%ebx
4004cc: 83 c3 01 add $0x1,%ebx
4004cf: 83 c3 01 add $0x1,%ebx
4004d2: 83 c3 01 add $0x1,%ebx
4004d5: 83 c3 01 add $0x1,%ebx
4004d8: 83 c3 01 add $0x1,%ebx
4004db: 83 c3 01 add $0x1,%ebx
4004de: 83 c3 01 add $0x1,%ebx
4004e1: 83 c3 01 add $0x1,%ebx
4004e4: 83 c3 01 add $0x1,%ebx
4004e7: 83 45 f4 01 addl $0x1,-0xc(%rbp)
4004eb: 81 7d f4 9f 86 01 00 cmpl $0x1869f,-0xc(%rbp)
4004f2: 7e d5 jle 4004c9 <main+0x13>
4004f4: 89 d8 mov %ebx,%eax
4004f6: 5b pop %rbx
4004f7: 5d pop %rbp
4004f8: c3 retq
4004f9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
Note that the instructions for incrementing are now add $0x1,%ebx, so we can see that our variable is indeed stored in a register (here ebx), as requested.
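The other option mentioned earlier, enabling optimizations, can be tried on the same loop without the register keyword. The sketch below wraps the loop in a function (count_up is a name chosen here for illustration). Compiling it with something like gcc -O2 -S and reading the generated assembly should show no stack traffic for b; an optimizing compiler will typically keep it in a register or even replace the whole loop with its final value, though the exact output depends on the compiler and version.

```c
/* Same loop as in the snippets above, moved into a function.
 * count_up is a hypothetical name used only for this example. */
int count_up(void)
{
    int b = 0;
    int i;
    for (i = 0; i < 100000; i++) {
        /* ten increments per iteration, as in the original snippet */
        b++;
        b++;
        b++;
        b++;
        b++;
        b++;
        b++;
        b++;
        b++;
        b++;
    }
    return b;
}
```

The result is the same either way (100000 iterations times 10 increments); only the number of memory accesses needed to compute it changes.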
"I thought b was a local variable and as such, wasn't stored in memory but rather just the stack or in a cache. Why is this occurring?"
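The stack is memory, so in an unoptimized build local variables are usually stored in memory. But you can change this behavior. With the second snippet I posted, you'll see a much smaller number of memory read/write operations, because b is no longer kept in memory but in a register. Only the loop counter i, which is still on the stack (-0xc(%rbp)), generates loads and stores inside the loop.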
Upvotes: 1