Reputation: 8538
If I have an example function like:
void func1(float a, float b, float c)
{
    setA(a);
    setB(b);
    setC(c);
}
Which calls inlined functions:
inline void setA(float a){ m_a = a; m_isValid = false; }
inline void setB(float b){ m_b = b; m_isValid = false; }
inline void setC(float c){ m_c = c; m_isValid = false; }
Should I care about the "m_isValid = false" duplication, or will the compiler eliminate it during optimization?
Upvotes: 5
Views: 577
Reputation: 379
Most modern compilers (with optimization enabled) should take care of it!
Upvotes: 0
Reputation: 300209
Yes, this is commonly known as Dead Store Elimination (read = load and write = store in compiler parlance).
In general, any useless operation can be optimized away by the compiler, provided it can prove that you (the user) cannot notice it (within the bounds set by the language).
For Dead Store Elimination in particular, it is generally restricted to cases where the compiler can prove that the stored value is never read before being overwritten, and where no volatile access or synchronization operation intervenes (see the special cases below).
Some examples:
struct Foo { int a; int b; };
void opaque(Foo& x); // opaque, aka unknown definition
Foo foo() {
    Foo x{1, 2};
    x.a = 3;
    return x; // provably returns {3, 2}
              // thus equivalent to Foo foo() { return {3, 2}; }
}

Foo bar() {
    Foo x{1, 2};
    opaque(x); // may use x.a, so need to leave it at '1' for now
    x.a = 3;
    return x;
}

Foo baz() {
    Foo x{1, 2};
    opaque(x);
    x.a = 1; // x.a may have been changed, cannot be optimized
    return x;
}
Note that whether you store the same value consecutively or not is of no importance: as long as the compiler can prove that a variable is not read between two store operations, it can safely eliminate the first one.
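For the pattern in the question, a minimal sketch of what this means (the class name Widget and the default values are assumptions; the setters are copied from the question): after inlining, the three m_isValid stores are back-to-back with no intervening read, so an optimizer is typically free to keep only the last one.

struct Widget {
    float m_a = 0, m_b = 0, m_c = 0;
    bool  m_isValid = true;

    void setA(float a) { m_a = a; m_isValid = false; }
    void setB(float b) { m_b = b; m_isValid = false; }
    void setC(float c) { m_c = c; m_isValid = false; }

    void func1(float a, float b, float c)
    {
        setA(a); // inlined: m_a = a; m_isValid = false;  (dead store)
        setB(b); // inlined: m_b = b; m_isValid = false;  (dead store)
        setC(c); // inlined: m_c = c; m_isValid = false;  (the one that stays)
    }
};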
A special case: by specification in C++, loads and stores to a volatile cannot be optimized away. This is so because volatile was specified to allow interactions with the hardware, and thus the compiler cannot know a priori whether the hardware will read or write to the variable behind the program's back.
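A minimal sketch of this case (hypothetical class, assuming the flag is declared volatile): every store to m_isValid is now observable behavior and must be emitted, even though each value is immediately overwritten.

struct Sensor {
    float m_a = 0, m_b = 0;
    volatile bool m_isValid = true; // volatile: every access is observable behavior

    void setA(float a) { m_a = a; m_isValid = false; } // this store must stay
    void setB(float b) { m_b = b; m_isValid = false; } // and so must this one
};

void update(Sensor& s, float a, float b)
{
    s.setA(a); // even after inlining, both m_isValid stores remain
    s.setB(b);
}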
Another special case: for the purpose of optimizations, memory synchronization operations (fences, barriers, etc.) used in multi-threaded programs can also prevent this kind of optimization. This is because, pretty much like in the volatile case, the synchronization means that another thread of execution may have modified the variable behind this thread's back.
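A small sketch of this case, using std::mutex as the synchronization operation (the names are illustrative): between the two critical sections another thread may lock the mutex and legitimately observe the first value, so the compiler cannot treat the first store as dead.

#include <mutex>

std::mutex g_mutex;
int g_shared = 0;

void writer()
{
    {
        std::lock_guard<std::mutex> lock(g_mutex);
        g_shared = 1; // another thread may lock g_mutex and observe 1...
    }                 // ...after this unlock
    {
        std::lock_guard<std::mutex> lock(g_mutex);
        g_shared = 2; // so the first store is not dead and must be kept
    }
}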
Finally, like all optimizations, its effectiveness greatly depends on knowledge of the context. If it is proven that opaque either does not read or does not write to x.a, then some stores may be optimized out (provable if the compiler can inspect the definition of opaque), so in general it really depends on inlining and constant propagation.
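As a standalone sketch of that last point (assuming the optimizer can see the body of opaque, e.g. in the same translation unit or via LTO): here opaque provably never touches x.a, so the initial store to x.a can again be treated as dead.

struct Foo { int a; int b; };

void opaque(Foo& x) { x.b += 1; } // visible definition: never reads or writes x.a

Foo bar()
{
    Foo x{1, 2};
    opaque(x); // inlinable, provably does not touch x.a
    x.a = 3;   // so the initial store of 1 to x.a may be eliminated
    return x;  // effectively returns {3, 3}
}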
Upvotes: 10
Reputation: 114579
A decent compiler should remove them in this specific case.
Completing it to a full, compilable example:
struct Foo {
    float m_a, m_b, m_c;
    bool m_isValid;

    void setA(float a){ m_a = a; m_isValid = false; }
    void setB(float b){ m_b = b; m_isValid = false; }
    void setC(float c){ m_c = c; m_isValid = false; }

    void func1(float a, float b, float c);
};

Foo f;

void func1(float a, float b, float c)
{
    f.setA(a);
    f.setB(b);
    f.setC(c);
}
g++ in this case compiles func1 to:
_Z5func1fff:
.LFB3:
.cfi_startproc
movl 4(%esp), %eax ;; loads a
movb $0, f+12 ;; clears m_isValid
movl %eax, f ;; stores m_a
movl 8(%esp), %eax ;; loads b
movl %eax, f+4 ;; stores m_b
movl 12(%esp), %eax ;; loads c
movl %eax, f+8 ;; stores m_c
ret
.cfi_endproc
Note that while it's true that you should keep an eye on how you design a program when performance is an issue, this kind of micro-level optimization is best left for the end, after measuring where the code is actually losing time.
Upvotes: 5