Reputation: 307
In this piece of code, I access exitButton.rectangle several times in a row, and I wondered whether it is optimized so that the generated object code doesn't have to look up rectangle in exitButton every time:
struct MenuItem {
    Rectangle rectangle;
};
MenuItem exitButton;
exitButton.rectangle.top = 383;
exitButton.rectangle.height = 178;
exitButton.rectangle.left = 0;
exitButton.rectangle.width = 1024;
Do I need to write something like this instead to guarantee it is optimized?
Rectangle &tempRectangle = exitButton.rectangle;
tempRectangle.top = 383;
tempRectangle.height = 178;
tempRectangle.left = 0;
tempRectangle.width = 1024;
Would it be the same when using a class instead of a struct? Thanks in advance.
EDIT
g++ -O0, no reference:
CPU Disasm
Address Hex dump Command Comments
004013B0 /$ 55 PUSH EBP ; CppTest.004013B0(guessed void)
004013B1 |. 89E5 MOV EBP,ESP
004013B3 |. 83E4 F0 AND ESP,FFFFFFF0 ; DQWORD (16.-byte) stack alignment
004013B6 |. 83EC 10 SUB ESP,10
004013B9 |. E8 42060000 CALL 00401A00 ; [CppTest.00401A00
004013BE |. C70424 7F0100 MOV DWORD PTR SS:[LOCAL.4],17F
004013C5 |. C74424 04 B20 MOV DWORD PTR SS:[LOCAL.3],0B2
004013CD |. C74424 08 000 MOV DWORD PTR SS:[LOCAL.2],0
004013D5 |. C74424 0C 000 MOV DWORD PTR SS:[LOCAL.1],400
004013DD |. B8 00000000 MOV EAX,0
004013E2 |. C9 LEAVE
004013E3 \. C3 RETN
g++ -O0, reference:
CPU Disasm
Address Hex dump Command Comments
004013B0 /$ 55 PUSH EBP ; CppTest.004013B0(guessed void)
004013B1 |. 89E5 MOV EBP,ESP
004013B3 |. 83E4 F0 AND ESP,FFFFFFF0 ; DQWORD (16.-byte) stack alignment
004013B6 |. 83EC 20 SUB ESP,20
004013B9 |. E8 62060000 CALL 00401A20 ; [CppTest.00401A20
004013BE |. 8D4424 0C LEA EAX,[LOCAL.5]
004013C2 |. 894424 1C MOV DWORD PTR SS:[LOCAL.1],EAX
004013C6 |. 8B4424 1C MOV EAX,DWORD PTR SS:[LOCAL.1]
004013CA |. C700 7F010000 MOV DWORD PTR DS:[EAX],17F
004013D0 |. 8B4424 1C MOV EAX,DWORD PTR SS:[LOCAL.1]
004013D4 |. C740 04 B2000 MOV DWORD PTR DS:[EAX+4],0B2
004013DB |. 8B4424 1C MOV EAX,DWORD PTR SS:[LOCAL.1]
004013DF |. C740 08 00000 MOV DWORD PTR DS:[EAX+8],0
004013E6 |. 8B4424 1C MOV EAX,DWORD PTR SS:[LOCAL.1]
004013EA |. C740 0C 00040 MOV DWORD PTR DS:[EAX+0C],400
004013F1 |. B8 00000000 MOV EAX,0
004013F6 |. C9 LEAVE
004013F7 \. C3 RETN
Upvotes: 2
Views: 160
Reputation: 3350
In this particular case there isn't any optimization for the compiler to perform. In fact, the compiler may have to work harder in the second case to produce equally efficient code, because it first has to resolve the reference alias.
The reason is that the Rectangle is not a pointer but is embedded directly in the MenuItem. In such a case, the compiler sees the whole structure tree as a flat set of variables and thinks of each member in terms of its byte offset from the start of the struct. Example:
struct Item1 {
    int i1, i2, i3;
};
struct Item2 {
    Item1 item1;
    int t1, t2;
};
... is structurally equivalent internally to:
struct ItemAll {
    int i1, i2, i3;
    int t1, t2;
};
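On typical implementations the two have the same memory layout, so a reinterpret_cast between Item2 and ItemAll would happen to work in this example (a static_cast between unrelated types does not compile, and the standard does not formally guarantee the result). In either case, whether you reference ItemAll.i2 or Item2::item1.i2, the compiler views it internally as variable_base_address + sizeof(int). The same applies to classes as well as structs.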
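The flat-offset view can be checked with offsetof. This is only a minimal sketch, assuming a typical implementation where the int members are laid out without padding; the helper names nested_i2_offset and flat_i2_offset are made up for illustration.

```cpp
#include <cstddef>  // offsetof, std::size_t

struct Item1 {
    int i1, i2, i3;
};

struct Item2 {
    Item1 item1;
    int t1, t2;
};

struct ItemAll {
    int i1, i2, i3;
    int t1, t2;
};

// Offset of i2 when reached through the nested member: the offset of
// item1 inside Item2 plus the offset of i2 inside Item1.
constexpr std::size_t nested_i2_offset() {
    return offsetof(Item2, item1) + offsetof(Item1, i2);
}

// Offset of i2 in the hand-flattened struct.
constexpr std::size_t flat_i2_offset() {
    return offsetof(ItemAll, i2);
}

// Assumption: these hold on mainstream compilers, but the standard does
// not promise that the nested and flat structs share a layout.
static_assert(nested_i2_offset() == flat_i2_offset(), "same offset for i2");
static_assert(offsetof(Item2, t1) == offsetof(ItemAll, t1), "same offset for t1");
static_assert(sizeof(Item2) == sizeof(ItemAll), "same overall size");
```

Because every offset here is a compile-time constant, an access through the nested member needs no more work at run time than an access to the flat struct.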
What you need to be concerned about is when you're using the -> operator, e.g. if your struct was designed like so:
struct MenuItem {
    Rectangle* rectangle;
};
In this case, the compiler must do an extra dereferencing step to access the contents of Rectangle. With optimizations enabled, any modern compiler will optimize the dereferences as best it can, so that's not a problem. Where a problem can occur is if you have a lot of interleaved dereferences that simply exceed the available registers on the CPU:
struct MenuItem {
    Rectangle* rect1;
    Rectangle* rect2;
    Point* point1;
    Point* point2;
};
menuItem.rect1->top = 50;
menuItem.rect2->top = 50;
menuItem.point1->x = 33;
menuItem.point2->x = 45;
menuItem.rect1->bottom = 450;
menuItem.rect2->bottom = 450;
// etc...
In the above case the compiler may run out of registers suitable for use as base registers, forcing it to redundantly reload the pointers for later uses (it won't run out of registers in this specific example, since these assignments store immediates, but if any variable arithmetic were involved the chance becomes much higher). In this example it would also be trivial for you to reorder things so that all rect1 assignments are grouped together, and so on, though if we were doing something more complex than simple assignment that might not be possible either.
In this case however, even using references won't help. The compiler will have to spill and re-load the pointer dereferences in the same fashion regardless.
Conclusion: Feel free to use Rectangle& var = menuItem.var; as a code simplification tool. It can make your code easier to maintain and reduce some of your typing! But it won't help the compiler do its job, so if that's your aim, don't bother.
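To make the comparison concrete, here is a minimal sketch of the two styles from the question side by side. The Rectangle definition is an assumption (the question does not show it; the field order is guessed from the store offsets in the disassembly), and the function names make_direct and make_with_reference are hypothetical.

```cpp
// Hypothetical Rectangle: four ints in the order the question assigns them.
struct Rectangle {
    int top, height, left, width;
};

struct MenuItem {
    Rectangle rectangle;
};

// Spells out the full member path on every line.
MenuItem make_direct() {
    MenuItem exitButton;
    exitButton.rectangle.top = 383;
    exitButton.rectangle.height = 178;
    exitButton.rectangle.left = 0;
    exitButton.rectangle.width = 1024;
    return exitButton;
}

// Uses a reference purely as shorthand for the same sub-object.
MenuItem make_with_reference() {
    MenuItem exitButton;
    Rectangle& r = exitButton.rectangle;
    r.top = 383;
    r.height = 178;
    r.left = 0;
    r.width = 1024;
    return exitButton;
}
```

Both functions fill in the same four fields. With optimization enabled (for example g++ -O2) one would expect them to compile to the same stores, but comparing the generated assembly is the only way to confirm that for a given compiler.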
Upvotes: 3
Reputation: 490058
Taking the second question first, class vs. struct will make no difference (at least with any sane compiler). The only difference between the two is the default accessibility of members (private for a class, public for a struct).
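As a quick illustration of that point, here is a sketch with two hypothetical types, RectS and RectC, that differ only in the keyword used. The helper make_exit_button is likewise made up for this example.

```cpp
#include <type_traits>  // std::is_standard_layout

// Members of a struct are public unless stated otherwise.
struct RectS {
    int top, height, left, width;
};

// Members of a class are private unless stated otherwise, so the
// access specifier is needed to make the two types equivalent.
class RectC {
public:
    int top, height, left, width;
};

// The keyword does not change size or layout category.
static_assert(sizeof(RectS) == sizeof(RectC), "same size");
static_assert(std::is_standard_layout<RectS>::value, "struct is standard layout");
static_assert(std::is_standard_layout<RectC>::value, "class is standard layout");

// The same member-access code works unchanged for either type.
template <typename R>
R make_exit_button() {
    R r;
    r.top = 383;
    r.height = 178;
    r.left = 0;
    r.width = 1024;
    return r;
}
```

Calling make_exit_button<RectS>() and make_exit_button<RectC>() goes through the identical template body, so any difference in the generated code would have to come from something other than the class/struct keyword.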
About the only way to be sure about optimization is to look at the object code, but I'd expect any reasonable compiler to determine that you're using the same series of references to get to exitButton.rectangle repeatedly, and automatically use a register to refer to that for the successive accesses.
If you turn off all optimization, that might not be the case, but with essentially any optimization allowed at all, you can expect it -- this optimization has been well known for years (it's basically a common subexpression elimination).
Upvotes: 3