Reputation: 7149
I've profiled my application to see why my 3D-vector implementation is almost 3 times slower than the corresponding C function calls: the results proved that every single function call costs more time than the actual arithmetic being performed! I've already cut the number of function calls down to 3, but that doesn't help a lot.
It seems that, for some reason, the arithmetic operator calls take even more time than other function calls, and looking at the disassembly I found out why: they are the only functions that haven't been inlined, in spite of full optimization! Each call takes ~10 instructions of preparation just to store the two operands. Compared to that, calling the corresponding C function takes only 2 instructions to store each double pointer argument.
Here's a simplified segment of my code (add include guards as needed):
// header vector3d.h
class VectorExpression3d;
class Vector3d {
public: // will see about visibility later...
    double x, y, z;
    Vector3d() : x(0.0), y(0.0), z(0.0) {} // needed for "Vector3d v1, v2, v3;" in main
    Vector3d(const VectorExpression3d& ve);
    Vector3d& operator=(const VectorExpression3d& ve);
};
#include "vectorexpression3d.h"
// implementation ...
// header vectorexpression3d.h
#include "vector3d.h"
class VectorExpression3d {
public:
    double x, y, z, scale;
    VectorExpression3d(const Vector3d& v1, const Vector3d& v2)
        : x(v1.x+v2.x), y(v1.y+v2.y), z(v1.z+v2.z), scale(1.0) {}
};
// main cpp file
#include "vector3d.h"
inline VectorExpression3d operator+(const Vector3d& v1, const Vector3d& v2) {
    return VectorExpression3d(v1, v2);
}
int main() {
    // code
    Vector3d v1, v2, v3;
    v3 = v1+v2; // invokes non-inlined call to operator+ above,
                // then inlined(!) VectorExpression3d constructor
                // then inlined(!) Vector3d constructor
                // then inlined VectorExpression3d destructor
    // ...
}
I'm using VS 2010, and it seems the compiler ignores the inline specifier on every one of the operators. I know that I cannot force inlining - but it should be possible, and since the operators are trivial, it should even be easy! So what is the problem? Why doesn't VS 2010 inline my operators? Is it not possible after all?
According to my profiling results, the call to operator+ by itself uses up more than half of the total time of the addition statement, including the assignment and the construction/destruction of a temporary!
P.S.: Maybe this is important, but I forgot to mention that the actual classes are, in fact, templates (only template arg is the base type (double) so far, so not a biggie)
Upvotes: 0
Views: 1606
Reputation: 36597
Any member function of a class - including an operator - that is defined within the class definition can be inlined.
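As a minimal sketch of that point, the operators below are defined inside the class body, which makes them implicitly inline. Vec3 is a hypothetical stand-in for illustration only, not the asker's Vector3d, and the expression-template layer from the question is left out.

```cpp
// Vec3 is a hypothetical stand-in for the question's Vector3d.
struct Vec3 {
    double x, y, z;

    Vec3(double x_, double y_, double z_) : x(x_), y(y_), z(z_) {}

    // Defined inside the class body, so it is implicitly inline.
    Vec3& operator+=(const Vec3& rhs) {
        x += rhs.x;
        y += rhs.y;
        z += rhs.z;
        return *this;
    }

    // A friend function defined in the class body is implicitly inline too.
    // lhs is taken by value so the copy can be modified and returned.
    friend Vec3 operator+(Vec3 lhs, const Vec3& rhs) {
        lhs += rhs;
        return lhs;
    }
};
```

Being implicitly inline only makes the functions candidates for inlining; whether a call such as a + b is actually expanded in place is still up to the compiler.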
As with any use of inline functions (e.g. within a class definition, or functions explicitly declared inline), the compiler is not obligated to actually inline the function. Compilers are permitted to treat inlining as a hint, and then to ignore that hint. Compilation settings (e.g. optimisation options) can make the compiler more or less aggressive about inlining. Some compilers refuse to inline certain functions (e.g. some refuse to inline a function with a switch statement).
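Where the hint is being ignored, some compilers offer a stronger, non-standard request. The sketch below wraps MSVC's __forceinline and the GCC/Clang always_inline attribute in a macro; FORCE_INLINE and Vec3 are hypothetical names chosen for this example, and whether this changes anything for the code in the question is an assumption that would have to be checked in the disassembly.

```cpp
// FORCE_INLINE is a hypothetical macro name used only in this sketch.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // fall back to the plain standard hint
#endif

// Vec3 is a hypothetical aggregate standing in for the question's Vector3d.
struct Vec3 {
    double x, y, z;
};

// A free operator marked with the stronger, compiler-specific request.
FORCE_INLINE Vec3 operator+(const Vec3& a, const Vec3& b) {
    Vec3 r = { a.x + b.x, a.y + b.y, a.z + b.z };
    return r;
}
```

Even __forceinline is not a guarantee: MSVC can still decline (for example when inlining is disabled with /Ob0) and reports warning C4714 when it does.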
Some modern compilers are even smart enough in some circumstances to inline a function that the programmer has not specified as being inline. In practice, modern compilers are often able to make better choices about inlining (e.g. to exploit capabilities of the host system) than most programmers do.
Upvotes: 2