Andrew

Reputation: 867

How do I refactor this loop using template metaprogramming?

I am new to template meta-programming but I'm trying to refactor some matrix manipulation code for a speed boost. In particular, right now my function looks like this:

template<int SIZE> void do_something(matrix A) {
   for (int i = 0; i < SIZE; ++i) {
      // do something on column i of A
   }
}

I saw some techniques that use templates to rewrite this as

#define SIZE whatever
template<int COL> void process_column(matrix A) {
   // do something on column COL of A
   process_column<COL + 1>(A);
}
template<> void process_column<SIZE - 1>(matrix A) { 
   return; 
}
void do_something(matrix A) {
   process_column<0>(A);
}

When I did that to my function and set the compiler flags to inline appropriately, I saw a pretty decent (~10%) speed boost. The problem is that SIZE is #defined rather than a template parameter, and I will definitely be using different sizes in my program. So I want something like

template<int COL, int SIZE> void process_column(matrix A) {
   // do something on column COL of A
   process_column<COL + 1, SIZE>(A);
}
/* HOW DO I DECLARE THE SPECIFIC INSTANCE???? 
   The compiler rightfully complained when I tried this: */
template<int SIZE> void process_column<SIZE - 1, SIZE>(matrix A) { 
   return; 
}
template<int SIZE> void do_something(matrix A) {
   process_column<0, SIZE>(A);
}

How do I declare the specific instance to get the loop to terminate? Thanks in advance!

Upvotes: 0

Views: 147

Answers (1)

Jarod42

Reputation: 217245

You cannot partially specialize a function template, but you can partially specialize a class template.

The following may help:

namespace detail {

    template<int COL, int SIZE> 
    struct process_column
    {
        static void call(matrix& A) {
            // do something on column COL of A
            process_column<COL + 1, SIZE>::call(A);
        }
    };

    template<int SIZE>
    struct process_column<SIZE, SIZE> // Stop the recursion
    {
        static void call(matrix& A) { return; }
    };

} // namespace detail

template<int SIZE> void do_something(matrix& A) {
   detail::process_column<0, SIZE>::call(A);
}
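
To make the termination rule concrete, here is a minimal self-contained sketch of the same pattern. The names `toy_matrix`, `scale_column`, and `scale_columns` are hypothetical stand-ins (the question never shows its `matrix` type), and the per-column work is only an illustration: column COL is multiplied by COL + 1.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the question's `matrix`: 3 rows, N columns.
template <std::size_t N>
struct toy_matrix {
    std::array<std::array<int, N>, 3> rows{};
};

namespace detail {
    // Primary template: process column COL, then recurse to COL + 1.
    template <std::size_t COL, std::size_t SIZE>
    struct scale_column {
        static void call(toy_matrix<SIZE>& A) {
            for (auto& row : A.rows) {
                row[COL] *= static_cast<int>(COL + 1);
            }
            scale_column<COL + 1, SIZE>::call(A);
        }
    };

    // Partial specialization: COL == SIZE stops the recursion.
    template <std::size_t SIZE>
    struct scale_column<SIZE, SIZE> {
        static void call(toy_matrix<SIZE>&) {}
    };
} // namespace detail

// SIZE is deduced from the argument, so callers need no explicit size.
template <std::size_t SIZE>
void scale_columns(toy_matrix<SIZE>& A) {
    detail::scale_column<0, SIZE>::call(A);
}
```

Calling `scale_columns(m)` on a `toy_matrix<4>` instantiates `scale_column<0, 4>` through `scale_column<4, 4>`. Because the terminating case is `<SIZE, SIZE>` rather than `<SIZE - 1, SIZE>`, the last column is still processed.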

An alternative using C++11 variadic templates:

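#if 1 // index_sequence is not in C++11; C++14 (C++1y) provides std::index_sequence in <utility>
#include <cstddef> // std::size_t
#include <utility> // std::forward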

template <std::size_t ...> struct index_sequence {};

template <std::size_t I, std::size_t ...Is>
struct make_index_sequence : make_index_sequence<I - 1, I - 1, Is...> {};

template <std::size_t ... Is>
struct make_index_sequence<0, Is...> : index_sequence<Is...> {};

#endif

namespace details {
    template <template <std::size_t> class T, std::size_t ... Is, typename ... Args>
    void for_each_column_apply(const index_sequence<Is...>&, Args&&...args)
    {
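        // Braced-init-list elements are evaluated left to right, so columns are visited in order.
        // The leading 0 keeps the array non-empty when the index pack is empty.
        int dummy[] = {0, (T<Is>()(std::forward<Args>(args)...), 0)...};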
        static_cast<void>(dummy); // remove warning for unused variable
    }
} // namespace details


template <template <std::size_t> class T, std::size_t N, typename ... Args>
void for_each_column_apply(Args&&... args)
{
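    // make_index_sequence<N> expands to the indices 0..N-1; index_sequence<N> would hold only the single value N.
    details::for_each_column_apply<T>(make_index_sequence<N>(), std::forward<Args>(args)...);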
}

Usage:

class Matrix {};

template <std::size_t COL>
struct MyFunctor
{
    void operator() (Matrix&m /* other needed args*/) const
    {
        // Do the job for column COL
    }
};

int main() {
    constexpr std::size_t SIZE = 42;
    Matrix m;
    for_each_column_apply<MyFunctor, SIZE>(m /* other args needed by MyFunctor*/);

    return 0;
}
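
If C++14 or later is available, the hand-written `index_sequence` can be replaced by `std::index_sequence` and `std::make_index_sequence` from `<utility>`. The sketch below assumes that and uses a hypothetical functor, `mark_column`, that only records which columns were visited. It is an illustration of the pack-expansion technique, not a drop-in replacement for the code above. Arguments are taken by lvalue reference here instead of being forwarded, which is a deliberate simplification: the same arguments are passed to every column functor, so forwarding an rvalue repeatedly could hand later calls a moved-from object.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical functor: records that column COL was visited.
template <std::size_t COL>
struct mark_column {
    template <std::size_t N>
    void operator()(std::array<int, N>& visited) const {
        visited[COL] = static_cast<int>(COL) + 1;
    }
};

namespace details {
    template <template <std::size_t> class T, std::size_t... Is, typename... Args>
    void for_each_column_apply(std::index_sequence<Is...>, Args&... args) {
        // One call per index, evaluated left to right inside the braces.
        // The leading 0 keeps the array non-empty when the pack is empty.
        int dummy[] = {0, (T<Is>()(args...), 0)...};
        static_cast<void>(dummy); // silence the unused-variable warning
    }
} // namespace details

template <template <std::size_t> class T, std::size_t N, typename... Args>
void for_each_column_apply(Args&... args) {
    details::for_each_column_apply<T>(std::make_index_sequence<N>(), args...);
}
```

For example, given `std::array<int, 4> visited{};`, the call `for_each_column_apply<mark_column, 4>(visited);` invokes `mark_column<0>` through `mark_column<3>` in order.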

Upvotes: 1
