reinearthed

Reputation: 45

How can I unroll a loop using template metaprogramming in C++?

In fact, I want to evaluate the dot product of two arrays. When I try this:

template <int N, typename ValueType>
struct ScalarProduct {
    static ValueType product (ValueType* first, ValueType* second) {
        return ScalarProduct<N-1, ValueType>::product(first + 1, second + 1) 
            + *first * *second;
    }
};

template <typename ValueType>
struct ScalarProduct<0, ValueType> {
    static ValueType product (ValueType* first, ValueType* second) {
        return 0;
    }
};

then the time to compute it at run time is less than the time spent during compilation.

Upvotes: 2

Views: 919

Answers (3)

shibumi

Reputation: 378

First, you are writing functions. Metaprogramming or not, the compiler is going to generate functions, and since functions are not evaluated until run time, your approach is not going to reduce run time. In fact, it might add a bit of overhead, because you are turning a for loop into a chain of recursive function calls (unless the compiler inlines them).

To answer the more general question: with template metaprogramming you can only compute things at compile time. A standard way is to precompute the values you want and store them as members of a class. You can only use types such as enum (ones that don't need a constructor) for this, because constructor calls are executed at run time (C++11's constexpr constructors are the exception).

Metaprogramming is in most cases not practical. It is a good tool for learning about templates, but it can result in large binaries and an unmaintainable code base. So I'd advise you not to use it unless you have explored other options such as lookup tables.
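As a minimal sketch of what "store them as members" can look like, here is the classic enum-based factorial. The name Factorial is made up for illustration, and the static_assert assumes a C++11 compiler; the enum trick itself works in older compilers too.

```cpp
// Hypothetical example: each instantiation stores its result in an enum member,
// so the whole computation is done by the compiler.
template <unsigned N>
struct Factorial {
    enum { value = N * Factorial<N - 1>::value };
};

// Specialization for 0 ends the recursion.
template <>
struct Factorial<0> {
    enum { value = 1 };
};

// value is a constant expression, so it can be checked during compilation.
static_assert(Factorial<5>::value == 120, "computed at compile time");
```

No function is called here at all, which is why this kind of code costs nothing at run time.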

You can only work with arbitrary arrays if they are already defined in your code. For example

int a1[] = {1,2,3};
int a2[] = {2,4,5};

template <int N,typename T>
struct foo {
  T product;
  foo<N-1,T> rest;
  foo(const T* array1,const T* array2) : rest(array1+1,array2+1) { product = array1[0] * array2[0] + rest.product; }
};

// Base case: the partial specialization for N == 0 ends the recursion
template <typename T>
struct foo<0,T> {
  T product;
  // The pointers are one past the end of the arrays here, so don't dereference them
  foo(const T*, const T*) : product(0) {}
};

foo<3,int> myfoo(a1,a2);

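You can then read myfoo.product to get the dot product of a1 and a2. The template recursion is expanded at compile time; the multiplications themselves still run in the constructors unless the optimizer folds them away.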
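For a result that the compiler is required to evaluate, one alternative is a constexpr function. This is a different technique from the struct above, and the sketch assumes a C++14 compiler; the function name dot is made up, and the arrays mirror a1 and a2 but have to be declared constexpr for this to work.

```cpp
#include <cstddef>

// Hypothetical helper (assumes C++14): a plain loop that the compiler
// may evaluate itself when the arguments are constant expressions.
template <typename T, std::size_t N>
constexpr T dot(const T (&a)[N], const T (&b)[N]) {
    T sum = T();
    for (std::size_t i = 0; i < N; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

constexpr int a1[] = {1, 2, 3};
constexpr int a2[] = {2, 4, 5};

// 1*2 + 2*4 + 3*5 == 25, checked during compilation.
static_assert(dot(a1, a2) == 25, "evaluated by the compiler");
```

The same function can still be called with ordinary run-time arrays, in which case it behaves like a normal loop.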

Upvotes: 3

David Seiler

Reputation: 9705

If you want to understand why two pieces of code perform differently, you need a profiler.

If I had to guess, I'd say that your clever recursive template expansion is producing code that is too difficult for your compiler to optimize efficiently. Loops over floating-point arrays are maybe the most aggressively optimized construct in C++; your compiler may even have special cases explicitly for scalar products. Certainly it's capable of unrolling its own loops, if it wants to.

In simple matters like these, it's best not to try to trick the compiler into performing the optimizations you want. It understands scalar products better than you do, and is better-equipped to solve your problem efficiently. Really!
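To make that concrete, here is roughly what "leaving it to the compiler" looks like. Both function names are made up for this sketch; std::inner_product is the standard library's own version of the same loop. Whether either one gets unrolled or vectorized depends on your compiler and flags.

```cpp
#include <cstddef>
#include <numeric>

// Hypothetical helper: the straightforward loop the optimizer knows best.
template <typename T>
T dot_loop(const T* first, const T* second, std::size_t n) {
    T sum = T();
    for (std::size_t i = 0; i < n; ++i) {
        sum += first[i] * second[i];
    }
    return sum;
}

// Hypothetical helper: the same computation via the standard library.
template <typename T>
T dot_std(const T* first, const T* second, std::size_t n) {
    return std::inner_product(first, first + n, second, T());
}
```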

Or maybe I'm completely wrong, and something else entirely is to blame. Perhaps your benchmark is at fault, perhaps your sample sizes are too small. Perhaps any number of things. You should not try to guess why this strange result is happening; you should apply a profiler, and investigate.
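If you want a quick sanity check before reaching for a profiler, a crude timing harness might look like the sketch below. It assumes C++11 for std::chrono; loop_product and time_ns are names invented here, and the question's template is repeated (with const pointers) only so the block is self-contained. Treat the numbers it gives as a rough hint, not a measurement.

```cpp
#include <chrono>
#include <cstddef>

// The question's recursive template, repeated so the sketch is self-contained.
template <int N, typename ValueType>
struct ScalarProduct {
    static ValueType product(const ValueType* first, const ValueType* second) {
        return ScalarProduct<N - 1, ValueType>::product(first + 1, second + 1)
            + *first * *second;
    }
};

template <typename ValueType>
struct ScalarProduct<0, ValueType> {
    static ValueType product(const ValueType*, const ValueType*) {
        return ValueType();
    }
};

// Hypothetical plain loop to compare against.
template <typename ValueType>
ValueType loop_product(const ValueType* first, const ValueType* second,
                       std::size_t n) {
    ValueType sum = ValueType();
    for (std::size_t i = 0; i < n; ++i) {
        sum += first[i] * second[i];
    }
    return sum;
}

// Hypothetical timing helper: calls f() reps times and returns the
// elapsed wall-clock time in nanoseconds.
template <typename F>
long long time_ns(F f, int reps) {
    volatile double sink = 0;  // keeps the optimizer from discarding the calls
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) {
        sink = sink + f();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
        .count();
}
```

You would call it with something like time_ns([&] { return loop_product(a, b, 4); }, 1000000) and the same for ScalarProduct<4, double>::product(a, b), with optimizations turned on. With small constant inputs the compiler may still precompute parts of the work, which is one more reason to confirm any result with a real profiler.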

Good luck.

Upvotes: 0

user405725

Reputation:

The compiler may do it for you, or it may not. You have to look at the code the compiler generates to find out whether it does. If for some reason it does not, your only option is to drop the recursive template and come up with an alternative solution, e.g. a lookup table, a run-time loop, etc.

Upvotes: 0
