philkark

Reputation: 2457

Performance of chained public member access compared to pointer

I could only find questions about chained function calls, not about chained member access, so I would like to ask a few questions about the latter.

I have the following situation:

for(int i = 0; i < largeNumber; ++i)
{
  //do calculations with the same chained struct:
  //myStruct1.myStruct2.myStruct3.myStruct4.member1
  //myStruct1.myStruct2.myStruct3.myStruct4.member2
  //etc.
}

It is obviously possible to break this down using a pointer:

MyStruct4* myStruct4_pt = &myStruct1.myStruct2.myStruct3.myStruct4;
for(int i = 0; i < largeNumber; ++i)
{
  //do calculations with pointer:
  //myStruct4_pt->member1
  //myStruct4_pt->member2
  //etc.
}
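
For concreteness, here is a minimal self-contained sketch of both variants. The struct definitions, the `pad` members, the function names `sum_chained` and `sum_pointer`, and the arithmetic in the loop bodies are all made up for illustration; only the nesting mirrors my real code.

```cpp
// Hypothetical nested structs standing in for myStruct1..myStruct4.
struct MyStruct4 { long member1 = 0; long member2 = 0; };
struct MyStruct3 { int pad = 0; MyStruct4 myStruct4; };
struct MyStruct2 { char pad = 0; MyStruct3 myStruct3; };
struct MyStruct1 { double pad = 0; MyStruct2 myStruct2; };

// Variant 1: spell out the full member chain in every iteration.
long sum_chained(MyStruct1& myStruct1, int largeNumber) {
    for (int i = 0; i < largeNumber; ++i) {
        myStruct1.myStruct2.myStruct3.myStruct4.member1 += i;
        myStruct1.myStruct2.myStruct3.myStruct4.member2 += 2 * i;
    }
    return myStruct1.myStruct2.myStruct3.myStruct4.member1 +
           myStruct1.myStruct2.myStruct3.myStruct4.member2;
}

// Variant 2: take a pointer to the innermost struct once, before the loop.
long sum_pointer(MyStruct1& myStruct1, int largeNumber) {
    MyStruct4* myStruct4_pt = &myStruct1.myStruct2.myStruct3.myStruct4;
    for (int i = 0; i < largeNumber; ++i) {
        myStruct4_pt->member1 += i;
        myStruct4_pt->member2 += 2 * i;
    }
    return myStruct4_pt->member1 + myStruct4_pt->member2;
}
```

Both functions compute the same result when called on freshly initialized objects; the question is only whether they differ in speed.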

Is there a difference between member access (.) and a function access that, e.g., returns a pointer to a private variable?

Can the compiler optimize the first example, and does that depend strongly on the compiler?

If no optimization happens at compile time, can the CPU optimize the access at run time (e.g. by keeping the data in the L1 cache)?

Does chained member access make any difference to performance at all, given that variables are "wildly reassigned" at compile time anyway?

I would kindly ask to leave discussions out regarding readability and maintainability of code, as the chained access is, for my purposes, clearer.

Update: Everything is running in a single thread.

Upvotes: 0

Views: 123

Answers (1)

marc

Reputation: 6223

The chained access resolves to a constant offset from the start of the outermost struct, and a modern compiler will realize that.

But don't trust me, let's ask a compiler (see here).

#include <stdio.h>

struct D { float _; int i; int j; };

struct C { double _; D d; };

struct B { char _; C c; };

struct A { int _; B b; };

int bar(int i);
int foo(int i);

void foo(A &a) {
  for (int i = 0; i < 10; i++) {
    a.b.c.d.i += bar(i);
    a.b.c.d.j += foo(i);
  }
}

This compiles to:

foo(A&):
    pushq   %rbp
    movq    %rdi, %rbp
    pushq   %rbx
    xorl    %ebx, %ebx
    subq    $8, %rsp
.L3:
    movl    %ebx, %edi
    call    bar(int)
    addl    %eax, 28(%rbp)
    movl    %ebx, %edi
    addl    $1, %ebx
    call    foo(int)
    addl    %eax, 32(%rbp)
    cmpl    $10, %ebx
    jne .L3
    addq    $8, %rsp
    popq    %rbx
    popq    %rbp
    ret

As you can see, each chained access has been translated to a single constant offset from the base pointer: 28(%rbp) for a.b.c.d.i and 32(%rbp) for a.b.c.d.j.
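
If you want to check the offsets without reading assembly, the following is a rough sketch using the standard `offsetof` macro on the same four structs. The names `kOffsetI`, `kOffsetJ` and `chain_matches_offset` are mine, not part of anything above, and the concrete values assume a typical 64-bit ABI where `int` and `float` are 4 bytes and `double` is 8-byte aligned.

```cpp
#include <cstddef>  // offsetof, std::size_t

struct D { float _; int i; int j; };
struct C { double _; D d; };
struct B { char _; C c; };
struct A { int _; B b; };

// The offset of a.b.c.d.i from the start of A is the sum of the
// per-level offsets; all of them are compile-time constants.
constexpr std::size_t kOffsetI =
    offsetof(A, b) + offsetof(B, c) + offsetof(C, d) + offsetof(D, i);
constexpr std::size_t kOffsetJ =
    offsetof(A, b) + offsetof(B, c) + offsetof(C, d) + offsetof(D, j);

// Runtime cross-check: the address reached through the member chain
// equals the base address plus the precomputed constant.
bool chain_matches_offset(const A& a) {
    const char* base = reinterpret_cast<const char*>(&a);
    return reinterpret_cast<const char*>(&a.b.c.d.i) == base + kOffsetI &&
           reinterpret_cast<const char*>(&a.b.c.d.j) == base + kOffsetJ;
}
```

Under the ABI assumption above, `kOffsetI` and `kOffsetJ` come out as 28 and 32, matching the displacements in the assembly. On a platform with different sizes or alignment rules the numbers may differ, but they are still constants known at compile time.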

Upvotes: 2
