Reputation: 51
I have some legacy code to understand, and I noticed that the same struct is accessed very often within it. Would it make any difference if I saved the content of the struct beforehand and then accessed the local copy instead of going through the pointer each time?
I already compared some test code in an online compiler explorer to see whether the compiler would optimize it. I did that with https://godbolt.org/ using ARM64 gcc 8.2.
Variant A
typedef struct STRUCT_D{
    int myInt1IND;
    int myInt2IND;
    int myInt3IND;
    int myInt4IND;
    int myInt5IND;
    int myInt6IND;
    int myInt7IND;
    int myInt8IND;
    int myInt9IND;
} STRUCT_D;
typedef struct STRUCT_C{
    STRUCT_D myStructInDIntINC;
} STRUCT_C;
typedef struct STRUCT_B{
    STRUCT_C * myPointerB;
} STRUCT_B;
typedef struct STRUCT_A{
    STRUCT_B * myPointerA;
} STRUCT_A;
int square(void) {
    struct STRUCT_C myStructC;
    struct STRUCT_B myStructB;
    struct STRUCT_A myStructA;
    struct STRUCT_A* startPointer;
    myStructC.myStructInDIntINC.myInt1IND = 55;
    myStructB.myPointerB = &myStructC;
    myStructA.myPointerA = &myStructB;
    startPointer = &myStructA;
    int myresult =
        startPointer->myPointerA->myPointerB->myStructInDIntINC.myInt1IND +
        startPointer->myPointerA->myPointerB->myStructInDIntINC.myInt2IND +
        startPointer->myPointerA->myPointerB->myStructInDIntINC.myInt3IND +
        startPointer->myPointerA->myPointerB->myStructInDIntINC.myInt4IND +
        startPointer->myPointerA->myPointerB->myStructInDIntINC.myInt5IND +
        startPointer->myPointerA->myPointerB->myStructInDIntINC.myInt6IND +
        startPointer->myPointerA->myPointerB->myStructInDIntINC.myInt7IND +
        startPointer->myPointerA->myPointerB->myStructInDIntINC.myInt8IND +
        startPointer->myPointerA->myPointerB->myStructInDIntINC.myInt9IND;
    return myresult;
}
Variant B
typedef struct STRUCT_D{
    int myInt1IND;
    int myInt2IND;
    int myInt3IND;
    int myInt4IND;
    int myInt5IND;
    int myInt6IND;
    int myInt7IND;
    int myInt8IND;
    int myInt9IND;
} STRUCT_D;
typedef struct STRUCT_C{
    STRUCT_D myStructInDIntINC;
} STRUCT_C;
typedef struct STRUCT_B{
    STRUCT_C * myPointerB;
} STRUCT_B;
typedef struct STRUCT_A{
    STRUCT_B * myPointerA;
} STRUCT_A;
int square(void) {
    struct STRUCT_C myStructC;
    struct STRUCT_B myStructB;
    struct STRUCT_A myStructA;
    struct STRUCT_A* startPointer;
    myStructC.myStructInDIntINC.myInt1IND = 55;
    myStructB.myPointerB = &myStructC;
    myStructA.myPointerA = &myStructB;
    startPointer = &myStructA;
    struct STRUCT_D myResultStruct = startPointer->myPointerA->myPointerB->myStructInDIntINC;
    int myresult =
        myResultStruct.myInt1IND + myResultStruct.myInt2IND + myResultStruct.myInt3IND +
        myResultStruct.myInt4IND + myResultStruct.myInt5IND + myResultStruct.myInt6IND +
        myResultStruct.myInt7IND + myResultStruct.myInt8IND + myResultStruct.myInt9IND;
    return myresult;
}
I know that STRUCT_D is not fully initialized, but that is not relevant for this example. My question is whether variant B is "better". Of course it is more readable, but does it make sense to save the content a pointer refers to in a local? As I said, in my file the same pointer is dereferenced approximately 150 times in the same function. I know, I know... this function should definitely be refactored. :D
Upvotes: 0
Views: 335
Reputation: 364220
Copying data to a local can be useful to let compilers prove that no other accesses through other pointers read or write it.
So it's basically for the same reason you'd use int *restrict p. If you declare void func(struct foo *restrict ptr), then you're promising the compiler that a store through ptr->member is not going to change a value you read via any other pointer or from a global-scope variable, and that stores through other pointers won't change what you read via ptr.
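As a minimal sketch of that promise (the struct layout and the function names below are hypothetical, chosen only for illustration), compare the same loop with and without restrict. Whether a given compiler actually hoists the loads is something to check in the generated asm.

```c
#include <stddef.h>

/* Hypothetical struct for this sketch; not taken from the question. */
struct foo {
    int scale;
    int offset;
};

/* Without restrict: out[] is an int array and could overlap *cfg, so the
 * compiler has to assume each store to out[i] may have changed cfg->scale
 * or cfg->offset, and may reload them on every iteration. */
void apply_may_alias(const struct foo *cfg, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = out[i] * cfg->scale + cfg->offset;
}

/* With restrict: the caller promises that the objects accessed through cfg
 * and out do not overlap during the call, so both members can be loaded
 * once before the loop. */
void apply_restrict(const struct foo *restrict cfg, int *restrict out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = out[i] * cfg->scale + cfg->offset;
}
```

Both functions give the same result as long as the arguments really don't overlap; calling apply_restrict with overlapping arguments would break the promise and be undefined behaviour.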
Type-based alias analysis can already help significantly; accesses through a float* can't affect any int objects, for example. (Unless your program contains strict-aliasing UB; some compilers let you define that behaviour, e.g. gcc -fno-strict-aliasing.)
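A small sketch of what type-based alias analysis permits (function names are hypothetical). In the first function the stores go through a float*, so the compiler may assume they don't touch the int that *count reads; in the second, both pointers refer to int, so it has to allow for overlap.

```c
/* Stores go through float*, the loop bound is read through int*.
 * Under the strict-aliasing rules a float store cannot modify an int
 * object, so *count may be loaded once and kept in a register. */
void fill_float(float *dst, const int *count)
{
    for (int i = 0; i < *count; i++)
        dst[i] = 1.0f;
}

/* Both pointers refer to int: a store to dst[i] might modify *count,
 * so the bound may have to be reloaded on each iteration unless the
 * compiler can prove the two don't overlap. */
void fill_int(int *dst, const int *count)
{
    for (int i = 0; i < *count; i++)
        dst[i] = 1;
}
```

Compiling both with optimization on and comparing the loops is a quick way to see whether the reload of *count actually appears for your target.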
If you aren't doing assignments or reads through other pointers (which the compiler has to assume might be pointing to a member of a struct), it won't make a difference: alias analysis will succeed and let the compiler keep a struct member in a register across other accesses to memory, just like it could for a local.
(Alias analysis is typically easy for locals: if a local has never had its address taken, nothing can be pointing to it.)
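To make the case where it does matter concrete, here is a hedged sketch (the struct and function names are hypothetical). A store through an int* sits between two reads of a struct member, so without further information the compiler has to assume the store may have hit that member.

```c
/* Hypothetical struct for this sketch. */
struct config {
    int a;
    int b;
};

/* The store through out might write p->a or p->b, so the members read
 * for the return value may need to be reloaded from memory. */
int sum_through_pointer(const struct config *p, int *out)
{
    *out = p->a + 1;      /* int store that could alias a member of *p */
    return p->a + p->b;   /* reads after the store */
}

/* The copy is a local whose address never escapes, so nothing can point
 * to it and its members can stay in registers across the store. */
int sum_through_copy(const struct config *p, int *out)
{
    const struct config local = *p;
    *out = local.a + 1;
    return local.a + local.b;
}
```

Note that the two versions are not interchangeable if out really does point into *p: the first one sees the updated member, the copy does not. That difference is exactly what the compiler must preserve when it can't rule out aliasing.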
BTW, the reason the compiler is allowed to optimize away non-volatile / non-_Atomic memory accesses is that it's undefined behaviour to write a non-atomic object at the same time another thread is reading or writing it.
That makes it safe to assume that variables don't change unless you write them yourself, and that you don't need the value in memory to be "in sync" with the C abstract machine except when you make non-inline function calls. (For any object that some unknown function might have pointers to. This is typically not the case for local vars like loop counters, so they can be kept in call-preserved registers instead of being spilled/reloaded.)
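A rough illustration of the function-call point (all names here are hypothetical). A call through a function pointer stands in for a non-inline call to unknown code: the global has to be up to date in memory before the call and re-read afterwards, while a local whose address is never taken can stay in a register.

```c
/* Hypothetical global that unknown code may read or write. */
int shared_total;

/* Example callback for the sketch; any function could be passed in. */
void bump_total(void)
{
    shared_total += 5;
}

/* The callee is unknown here, so shared_total must be reloaded from
 * memory after the call: its value cannot be assumed unchanged. */
int read_around_call(void (*callback)(void))
{
    int before = shared_total;
    callback();
    int after = shared_total;
    return after - before;
}

/* The loop counter is a local whose address is never taken, so no
 * callee can modify it and it can live in a call-preserved register. */
int local_around_call(void (*callback)(void))
{
    int i = 0;
    for (; i < 3; i++)
        callback();
    return i;
}
```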
But there is a potential downside to declaring locals to hold copies of globals or pointed-to data: if the compiler doesn't end up keeping that local in a register for the whole function, it might end up having to actually copy the data into stack memory so it can reread from there. (If it can't prove that the original object is unchanged.)
Normally just favor readability over this level of micro-optimization, but have a look at the optimized asm for some platform you care about if you're curious. If there's a lot of unnecessary store/reload happening, then try using locals.
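For the readability side, one option that avoids copying the struct at all is to hoist the pointer chain into a single local pointer. The sketch below reuses the question's type names but shortens STRUCT_D to three members; the function names are hypothetical.

```c
/* Shortened versions of the question's types (three ints instead of nine). */
typedef struct STRUCT_D {
    int myInt1IND;
    int myInt2IND;
    int myInt3IND;
} STRUCT_D;

typedef struct STRUCT_C {
    STRUCT_D myStructInDIntINC;
} STRUCT_C;

typedef struct STRUCT_B {
    STRUCT_C *myPointerB;
} STRUCT_B;

typedef struct STRUCT_A {
    STRUCT_B *myPointerA;
} STRUCT_A;

/* Original style: the full chain is spelled out for every member. */
int sum_via_chain(const STRUCT_A *startPointer)
{
    return startPointer->myPointerA->myPointerB->myStructInDIntINC.myInt1IND +
           startPointer->myPointerA->myPointerB->myStructInDIntINC.myInt2IND +
           startPointer->myPointerA->myPointerB->myStructInDIntINC.myInt3IND;
}

/* The chain is walked once and the resulting address is kept in a local
 * pointer; no struct copy is made, and the source is shorter. */
int sum_via_local_pointer(const STRUCT_A *startPointer)
{
    const STRUCT_D *d = &startPointer->myPointerA->myPointerB->myStructInDIntINC;
    return d->myInt1IND + d->myInt2IND + d->myInt3IND;
}
```

This mainly helps the reader; the member reads still go through a pointer, so the aliasing considerations above apply to d just as they do to the full chain.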
Upvotes: 0
Reputation: 15576
There would be no real difference in this example: with optimization enabled, a compiler such as gcc or clang would keep the values in a stack variable and/or a register in both variants.
Upvotes: 2