wondering

Reputation: 373

Performance of local pointer/reference for convenience

Is there any performance downside when using one of the following methods of accessing an array element?

int someRandomArrayWithLongName[] = {0, 1, 2, 3, 4};

// case (1): access with alias reference
int &elem = someRandomArrayWithLongName[1];
while (!exit)
{
   ++elem;
}

// case (2): pointer access
int *elem = &someRandomArrayWithLongName[1];
while (!exit)
{
   ++(*elem);
}

// default case (3): conventional element access
while (!exit)
{
   ++someRandomArrayWithLongName[1];
}

In particular, will the compiler recognize the aliasing and do its magic to avoid allocating unnecessary storage for the reference or pointer (cases 1 and 2)? Or is it better to just use case 3?

EDIT: I used http://gcc.godbolt.org/ as suggested in the comments. Turns out the three cases produce the exact same assembly code (source):

test1():
    movzbl  exit(%rip), %eax
    testb   %al, %al
    jne .L1
    movl    someRandomArrayWithLongName+4(%rip), %eax
    addl    $1, %eax
.L3:
    movzbl  exit(%rip), %edx
    movl    %eax, %ecx
    addl    $1, %eax
    testb   %dl, %dl
    je  .L3
    movl    %ecx, someRandomArrayWithLongName+4(%rip)
.L1:
    rep ret
test2():
    movzbl  exit(%rip), %eax
    testb   %al, %al
    jne .L8
    movl    someRandomArrayWithLongName+4(%rip), %eax
    addl    $1, %eax
.L10:
    movzbl  exit(%rip), %edx
    movl    %eax, %ecx
    addl    $1, %eax
    testb   %dl, %dl
    je  .L10
    movl    %ecx, someRandomArrayWithLongName+4(%rip)
.L8:
    rep ret
test3():
    movzbl  exit(%rip), %eax
    testb   %al, %al
    jne .L14
    movl    someRandomArrayWithLongName+4(%rip), %eax
    addl    $1, %eax
.L16:
    movzbl  exit(%rip), %edx
    movl    %eax, %ecx
    addl    $1, %eax
    testb   %dl, %dl
    je  .L16
    movl    %ecx, someRandomArrayWithLongName+4(%rip)
.L14:
    rep ret
exit:
    .zero   1

Upvotes: 2

Views: 217

Answers (2)

molbdnilo

Reputation: 66371

The "unnecessary memory allocations" you're worried about involve only a tiny bit of stack space in the worst case and take essentially no extra time, so they're nothing to worry about.

That said, any modern compiler worth its salt will most likely generate identical code for all three in optimised builds.
In practice, use the one you find most readable (and least error-prone) and rewrite if it turns out to be too slow.

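For better performance, avoid touching memory inside the loop: keep modifications in a local variable so they can be performed in registers, and write the result back once at the end (this assumes nothing else reads the array element while the loop runs), like this: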

int increment = 0;
while (!exit)
{
   ++increment;
}
someRandomArrayWithLongName[1] += increment;

Upvotes: 4

BeeOnRope

Reputation: 64895

In general the third option will perform about as well as your compiler can make array access perform, while the performance of the first two options depends on the compiler's ability to trace the origin of the pointer back to the array.

In particular, a lot of optimizations might be suppressed when writes through arbitrary pointers occur, as in (1) and (2), since the compiler has a hard time guaranteeing that the various objects in the loop don't alias. Since the access in (3) is directly to an array of integers, more aggressive optimizations can often be applied (especially enregistration of loop variables).
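To make the aliasing concern concrete, here is a minimal sketch. The helpers add_scale and add_scale_local are hypothetical names, not from the question, and the sketch assumes the pointers arrive as function parameters so the compiler cannot see where they came from. When the two pointers overlap, the first version must observe the write made in an earlier iteration, which is why the compiler generally cannot just keep *scale in a register (it may instead reload it or insert a runtime overlap check).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the question): adds *scale to every element.
// data and scale both refer to int, so a store to data[i] may legally change
// *scale; the compiler has to honor that unless it can prove otherwise.
void add_scale(int* data, std::size_t n, const int* scale)
{
    for (std::size_t i = 0; i < n; ++i)
        data[i] += *scale;   // *scale is conceptually re-read every iteration
}

// Same loop, but the value is copied into a local first. Nothing can alias a
// local whose address is never taken, so it can stay in a register.
void add_scale_local(int* data, std::size_t n, const int* scale)
{
    const int s = *scale;
    for (std::size_t i = 0; i < n; ++i)
        data[i] += s;
}

// Shows that the two versions really differ when the pointers overlap.
void demo()
{
    int a[] = {1, 2, 3};
    add_scale(a, 3, &a[0]);        // scale aliases a[0], which changes mid-loop
    assert(a[0] == 2 && a[1] == 4 && a[2] == 5);

    int b[] = {1, 2, 3};
    add_scale_local(b, 3, &b[0]);  // the value 1 is captured before the loop
    assert(b[0] == 2 && b[1] == 3 && b[2] == 4);
}
```

Because the two functions give different results for overlapping arguments, the compiler is not allowed to silently turn the first into the second. Copying into a local is how the programmer states that no aliasing is intended.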

In the trivial example you gave, aliasing won't be a problem, since the loop does nothing but repeatedly access the same element. I assume your question is more general however - that you are interested in the performance of the approaches in more interesting loops that may very well contain operations where aliasing issues can arise.

It is relatively difficult to determine whether aliasing or any other (de)optimization has actually occurred just by staring at the source code. For example, modern compilers might handle all the situations above equivalently when elem is declared in the same function, but not when elem is passed in from the outside (since compilers have a more limited ability to optimize across function calls).

So while there is nothing wrong with understanding the theory, the best approach is to test if it really matters. Time the various approaches in the actual way they are used in real code, and if you are up for it, take a peek at the disassembly.
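As a rough starting point for such a measurement, here is a minimal timing sketch using std::chrono::steady_clock. The names time_ns, via_reference and via_index are hypothetical, and the fixed iteration count stands in for the question's exit flag so that the loops terminate. An optimizing compiler may well collapse both loops into a single addition, so treat the numbers as meaningful only when the loop body is your real code.

```cpp
#include <chrono>
#include <cstddef>

int someRandomArrayWithLongName[] = {0, 1, 2, 3, 4};

// Case (1) from the question, with a fixed iteration count (an assumption).
void via_reference(std::size_t iterations)
{
    int& elem = someRandomArrayWithLongName[1];
    for (std::size_t i = 0; i < iterations; ++i)
        ++elem;
}

// Case (3) from the question, with the same fixed iteration count.
void via_index(std::size_t iterations)
{
    for (std::size_t i = 0; i < iterations; ++i)
        ++someRandomArrayWithLongName[1];
}

// Hypothetical helper: runs f once and returns the elapsed wall-clock time
// in nanoseconds, measured with a monotonic clock.
template <typename F>
long long time_ns(F&& f)
{
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

A call such as time_ns([] { via_reference(1000000); }) returns the elapsed nanoseconds for one run. Repeating each measurement several times and comparing the fastest runs gives a fairer picture than a single sample.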

Upvotes: 1
