Reputation: 62106
There's an interesting optimization problem I'm facing.
In a large code base with many classes, the value of a non-constant global (file-scope) variable is read very often and in many places, and I want to avoid unnecessary memory accesses to it.
This variable is initialized once, but because its initialization is complex and requires calling a number of functions, it cannot be initialized before main() executes, like this:
unsigned size = 1000;
int main()
{
// some code
}
or
unsigned size = CalculateSize();
int main()
{
// some code
}
Instead it has to be initialized like this:
unsigned size;
int main()
{
// some code
size = CalculateSize();
// lots of code (statically/dynamically created class objects, whatnot)
// that makes use of "size"
return 0;
}
Just because size isn't a constant, is global (file scope), and the code is large and complex, the compiler is unable to infer that size never changes after size = CalculateSize();. The compiler generates code that fetches and refetches the value of size from the variable and can't "cache" it in a register or in a local (on-stack) variable that's likely to be in the CPU's d-cache together with other frequently accessed local variables.
So, if I have something like the following (a made-up example for illustrative purposes):
size = CalculateSize();
if (size > 200) blah1();
blah2();
if (size > 200) blah3();
The compiler thinks that blah1() and blah2() may change size, and it generates a memory read from size in if (size > 200) blah3();.
I'd like to avoid that extra read whenever and wherever possible.
Obviously, hacks like this:
const unsigned size = 0;
int main()
{
// some code
*(unsigned*)&size = CalculateSize();
// lots more code
}
won't do as they invoke undefined behavior.
The question is how to inform the compiler that it can "cache" the value of size once size = CalculateSize(); has been performed, and do it without invoking undefined behavior, unspecified behavior and, hopefully, implementation-specific behavior.
This is needed for C++03 and g++ (4.x.x). C++11 may or may not be an option; I'm trying to avoid advanced/modern C++ features to stay within the coding guidelines and predefined toolset.
So far I've only come up with a hack: create a constant copy of size within every class that uses it and use the copy, something like this (decltype makes it C++11, but we can do without decltype):
#include <iostream>
using namespace std;
volatile unsigned initValue = 255;
unsigned size;
#define CACHE_VAL(name) \
const struct CachedVal ## name \
{ \
CachedVal ## name() { this->val = ::name; } \
decltype(::name) val; \
} _CachedVal ## name;
#define CACHED(name) \
_CachedVal ## name . val
class C
{
public:
C() { cout << CACHED(size) << endl; }
CACHE_VAL(size);
};
int main()
{
size = initValue;
C c;
return 0;
}
The above may only help up to a point. Are there better and more suggestive-to-the-compiler alternatives that are legal C++? Hoping for a minimally intrusive (source-code-wise) solution.
UPDATE: To make it a bit more clear, this is in a performance-sensitive application. It's not that I'm trying to get rid of unnecessary reads of that particular variable on a whim. I'm trying to let/make the compiler produce more optimal code. Any solution that involves reading/writing another variable as often as size, and any additional code in the solution (especially with branching and conditional branching) executed as often as size is referred to, is also going to affect the performance. I don't want to win in one place only to lose the same or even more in another place.
Here's a related non-solution, causing UB (at least in C).
Upvotes: 3
Views: 231
Reputation: 5101
#include <iostream>
unsigned calculate() {
std::cout<<"calculate()\n";
return 42;
}
unsigned mySize() {
std::cout<<"mySize()\n";
static const unsigned someSize = calculate();
return someSize;
}
int main() {
std::cout<<"main()\n";
mySize();
}
prints:
main()
mySize()
calculate()
on GCC 4.8.0
Checking for whether it has been initialized already or not will be almost fully mitigated by the branch predictor. You will end up having one false and a quadrillion trues afterwards.
Yes, you will still have to access that initialization state on every call, potentially wreaking havoc in the caches, but you can't be sure unless you profile. Also, the compiler can likely do some extra magic for you (which is what you're looking for), so I suggest you first compile and profile with this approach before discarding it entirely.
Upvotes: 0
Reputation: 991
What about:
unsigned getSize()
{
static const unsigned size = calculateSize();
return size;
}
This will delay the initialization of size until the first call to getSize(), but still keep it const.
Tested with GCC 4.8.2.
Upvotes: 0
Reputation: 4012
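There's the register keyword in C++, which hints to the compiler that you plan on using a variable a lot. It can only be applied to local variables and function parameters, not to a global. I don't know about the compiler you're using, but most modern compilers make that decision for you, keeping a variable in a register when it pays off. You can also declare the variable as constant and initialize it using const_cast, although writing to an object that was defined const is undefined behavior, as the question points out.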
Upvotes: 2