Reputation: 518
I'm developing a C++11 application with GCC 5.4.0. In this application I have the following template:
template <class T1, class T2, class T3>
class Operator
{
T3* op1(T1* operand1, T2* operand2);
T3* op2(T1* operand1, T2* operand2);
T3* op3(T1* operand1, T2* operand2);
T3* op4(T1* operand1, T2* operand2);
//...
T3* opn(T1* operand1, T2* operand2);
};
Within op1, op2, ..., opn, I need to do a lot of work on arrays (potentially hundreds of millions of elements): arithmetic, comparisons, copies, etc. I chose templates because I'd like to write constructs like:
#pragma omp parallel for
for(int64_t i = 0; i < length; i++)
{
    r[i] = operand1[i] /*operations here*/ operand2[i];
}
For performance reasons, it makes no sense to check types inside the for loop with nested ifs. And since I want to support many types (int8_t, int16_t, int32_t, int64_t, float and double, and possibly the unsigned variants too), my code would be far too bloated anyway if I wrote a separate for loop for each operation and each combination of types.
The problem is that if I want to support around 6 to 10 types, the compiler needs to generate up to 10^3 versions of the code, one for every possible combination of types for T1, T2 and T3.
Generating and compiling all of this takes a long time, so I am looking for an alternative that compiles faster without adding too much overhead. I was thinking about using polymorphism, but I don't know how to get a similar result, mainly because of operator overloading: I'd need the operators =, +, -, *, /, >, <, etc. to work somehow with all these types, which I get for "free" with templates.
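To make the construct above concrete, here is a minimal sketch of one such templated kernel. The name add_arrays and the helper add_demo are hypothetical and only serve as illustration; the operation is fixed to addition.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical kernel: element-wise addition over raw arrays.
// T1/T2 are the operand types and T3 is the result type, as in the Operator template.
template <class T1, class T2, class T3>
void add_arrays(const T1* operand1, const T2* operand2, T3* r, std::int64_t length)
{
    assert(length >= 0);
    // The pragma only takes effect when OpenMP is enabled (e.g. -fopenmp with GCC).
    #pragma omp parallel for
    for (std::int64_t i = 0; i < length; i++)
    {
        // All types are fixed at compile time, so there is no per-element type check.
        r[i] = static_cast<T3>(operand1[i] + operand2[i]);
    }
}

// Small helper used only to demonstrate the kernel.
inline std::vector<double> add_demo()
{
    std::vector<std::int32_t> a = {1, 2, 3};
    std::vector<float> b = {0.5f, 1.5f, 2.5f};
    std::vector<double> r(a.size());
    add_arrays(a.data(), b.data(), r.data(), static_cast<std::int64_t>(a.size()));
    return r;
}
```

Each distinct <T1, T2, T3> triple that is actually used produces its own copy of the loop, which is where the compile-time cost comes from.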
Any pointers are much appreciated.
EDIT: My application is going to process users' arrays, and it will keep the user's data as it is (type and layout). These arrays can be of any type, so I might need to add an int64_t array to a double array, or any other combination of different types. For flexibility, I have a code path in my application that instantiates all 1000 possible combinations (I did this using recursive macros). So I am really looking for a way either to reduce the compile time even with this many template instantiations, or to turn this into a polymorphic construct.
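A code path like the one described in the EDIT could be sketched as a chain of switch statements instead of recursive macros. This is only an illustration under assumptions: TypeTag, add_kernel, pick_result, pick_second and add_dispatch are hypothetical names, and only two types are listed to keep it short.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical runtime tags for the element type of a user array.
enum class TypeTag { Int32, Float64 };

// The typed kernel: all types are known at compile time inside the loop.
template <class T1, class T2, class T3>
void add_kernel(const void* a, const void* b, void* r, std::int64_t n)
{
    const T1* x = static_cast<const T1*>(a);
    const T2* y = static_cast<const T2*>(b);
    T3* out = static_cast<T3*>(r);
    for (std::int64_t i = 0; i < n; i++)
        out[i] = static_cast<T3>(x[i] + y[i]);
}

// Each level resolves one tag and forwards the type as a template argument.
template <class T1, class T2>
void pick_result(TypeTag t3, const void* a, const void* b, void* r, std::int64_t n)
{
    switch (t3) {
    case TypeTag::Int32:   add_kernel<T1, T2, std::int32_t>(a, b, r, n); break;
    case TypeTag::Float64: add_kernel<T1, T2, double>(a, b, r, n); break;
    }
}

template <class T1>
void pick_second(TypeTag t2, TypeTag t3, const void* a, const void* b, void* r, std::int64_t n)
{
    switch (t2) {
    case TypeTag::Int32:   pick_result<T1, std::int32_t>(t3, a, b, r, n); break;
    case TypeTag::Float64: pick_result<T1, double>(t3, a, b, r, n); break;
    }
}

// Entry point: the type checks run once per call, never per element.
inline void add_dispatch(TypeTag t1, TypeTag t2, TypeTag t3,
                         const void* a, const void* b, void* r, std::int64_t n)
{
    assert(n >= 0);
    switch (t1) {
    case TypeTag::Int32:   pick_second<std::int32_t>(t2, t3, a, b, r, n); break;
    case TypeTag::Float64: pick_second<double>(t2, t3, a, b, r, n); break;
    }
}

// Demonstration: int32 array + double array -> double array.
inline double dispatch_demo()
{
    std::int32_t a[2] = {1, 2};
    double b[2] = {0.25, 0.5};
    double r[2] = {0.0, 0.0};
    add_dispatch(TypeTag::Int32, TypeTag::Float64, TypeTag::Float64, a, b, r, 2);
    return r[0] + r[1];
}
```

Note that this still instantiates one kernel per combination (8 here, 1000 with ten types), so on its own it organizes the dispatch but does not reduce compile time.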
EDIT2: After some adjustments (small improvements to my generation macro), I am able to compile with -O3 in about 30 minutes. This is OK for deployment and distribution, but it is very bad for development. So I'll set a debug flag and compile with reduced type support (only 4 or 6 types), which will reduce compilation time drastically. Thanks for all the input.
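One way the debug flag from EDIT2 might look is a type list that shrinks when a macro is defined. This is a sketch, not the actual generation macro: REDUCED_TYPES, SUPPORTED_TYPES and negate_in_place are hypothetical names, and the kernel takes a single type parameter to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// REDUCED_TYPES is a hypothetical flag, e.g. -DREDUCED_TYPES for development builds.
#ifdef REDUCED_TYPES
#define SUPPORTED_TYPES(X) X(std::int32_t) X(double)
#else
#define SUPPORTED_TYPES(X) \
    X(std::int8_t) X(std::int16_t) X(std::int32_t) X(std::int64_t) X(float) X(double)
#endif

// Count the entries in the list at compile time.
#define COUNT_ONE(T) + 1
constexpr std::size_t kNumTypes = 0 SUPPORTED_TYPES(COUNT_ONE);
#undef COUNT_ONE

// Number of <T1, T2, T3> combinations a three-parameter template would need.
constexpr std::size_t kNumCombinations = kNumTypes * kNumTypes * kNumTypes;

// Hypothetical single-type kernel, explicitly instantiated once per supported type.
template <class T>
void negate_in_place(T* data, std::int64_t length)
{
    assert(length >= 0);
    for (std::int64_t i = 0; i < length; i++)
        data[i] = static_cast<T>(-data[i]);
}

#define INSTANTIATE(T) template void negate_in_place<T>(T*, std::int64_t);
SUPPORTED_TYPES(INSTANTIATE)
#undef INSTANTIATE
```

With the six-type list, a three-parameter template needs 6^3 = 216 instantiations; with the two-type list it needs 2^3 = 8, which is why trimming the list has such a large effect on build time.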
Upvotes: 4
Views: 331
Reputation: 18041
If all these functions are going to be instantiated, the only way to avoid code bloat is to find, if possible, a common type, perform the operations on that common type, and then convert the result back to the specific type:
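// Compiled once: the real work is done on the common type.
common_type operator+(common_type, common_type);
// Thin inline wrapper: convert in, compute, convert back.
// T1 (the result type) cannot be deduced, so it has to be named explicitly.
template <class T1, class T2, class T3>
inline T1 operator+(T2 a, T3 b) {
    return T1{common_type{a} + common_type{b}};
}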
In category theory jargon, you need to find, if it exists, a morphism from each of the classes you defined (treating them as representations of categories) to the common_type class. This is not an easy task and may be impossible.
Maybe something intermediate is possible, such as T1 operator+(common_type, common_type); ...
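For plain arithmetic types, the common-type idea could be sketched as follows. This is an assumption-laden illustration, not a drop-in solution: it picks double as common_type (which cannot represent every int64_t value exactly), and it uses named functions add_common and add, both hypothetical, because operators cannot be overloaded for built-in types alone.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: double serves as the common type. It loses precision for
// int64_t values above 2^53, so it is not a lossless choice.
using common_type = double;

// The real work. In a larger project this would be defined in a single
// source file, so it is compiled only once.
inline common_type add_common(common_type a, common_type b)
{
    return a + b;
}

// Thin wrapper: convert both operands in, compute, convert the result back.
// T3 (the result type) comes first because it must be given explicitly;
// T1 and T2 are deduced from the arguments.
template <class T3, class T1, class T2>
inline T3 add(T1 a, T2 b)
{
    return static_cast<T3>(
        add_common(static_cast<common_type>(a), static_cast<common_type>(b)));
}

// Usage example: int8_t + float with an int32_t result.
inline void usage_example()
{
    // 3 + 2.5 = 5.5 in double, truncated to 5 when converted to int32_t.
    assert((add<std::int32_t>(std::int8_t{3}, 2.5f) == 5));
}
```

The wrapper is still instantiated per type combination, but it contains only conversions, so the heavy code exists once. Whether the per-element conversions are acceptable for very large arrays would need to be measured.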
Upvotes: 2