Reputation: 319
I have several segments of code in my C++ project that look something like this:
system<0>::global_instance.do_something();
system<1>::global_instance.do_something();
system<2>::global_instance.do_something();
system<3>::global_instance.do_something();
//...
system<29>::global_instance.do_something();
system<30>::global_instance.do_something();
system<31>::global_instance.do_something();
It seems a bit repetitive, right? It would be great if I could just do something like this:
for(int i = 0; i < 32; i++)
{
system<i>::global_instance.do_something();
}
Unfortunately, that won't work because the value of i isn't known at compile time and therefore cannot be used as a template argument for system.
What I need is a way to expand or unroll loops at compile time.
I've seen some implementations which use templates to achieve unrolling, but they don't work for what I'm trying to do. Ideally, the loop unrolling would take place during preprocessing, and could take any arbitrary statement as input.
For example, this:
#unroll NUM 0 5
foo<NUM>();
#endunroll
Would get translated into this:
foo<0>();
foo<1>();
foo<2>();
foo<3>();
foo<4>();
foo<5>();
And this:
#unroll NUM 0 5
blah blah blah NUM
#endunroll
Would get translated into this:
blah blah blah 0
blah blah blah 1
blah blah blah 2
blah blah blah 3
blah blah blah 4
blah blah blah 5
Even though blah blah blah NUM is a syntactically incorrect statement, it should still get copied, and every instance of the token NUM should get replaced by the appropriate number value.
Is there any way to achieve this in vanilla C++? Or are there any special C++ compilers which add this functionality?
Edit: I found an answer from @HTNW in the comments.
#include <iostream>
using namespace std;
template<auto begin, auto end>
inline void unroll(auto f)
{
if constexpr(begin < end)
{
f.template operator()<begin>();
unroll<begin + 1, end>(f);
}
}
template<int NUM>
struct thingy
{
static void print(){cout << NUM << endl;}
};
int main()
{
unroll<0,8>([]<int i>()
{
thingy<i>::print();
});
}
Output:
0
1
2
3
4
5
6
7
The line which makes this work is f.template operator()<begin>(). I have no idea what this line means and I've never seen it before in my life. If someone could explain it to me, that would be great.
Upvotes: 3
Views: 1313
Reputation: 3101
I use these C++20 unrolling helpers:
#pragma once
#include <cstddef>
#include <utility>
#include <concepts>
#include <iterator>
template<size_t N, typename Fn>
requires (N >= 1) && requires( Fn fn, size_t i ) { { fn( i ) } -> std::same_as<void>; }
inline
void unroll( Fn fn )
{
auto unroll_n = [&]<size_t ... Indices>( std::index_sequence<Indices ...> )
{
(fn( Indices ), ...);
};
unroll_n( std::make_index_sequence<N>() );
}
template<size_t N, typename Fn>
requires (N >= 1) && requires( Fn fn ) { { fn() } -> std::same_as<void>; }
inline
void unroll( Fn fn )
{
auto unroll_n = [&]<size_t ... Indices>( std::index_sequence<Indices ...> )
{
return ((Indices, fn()), ...);
};
unroll_n( std::make_index_sequence<N>() );
}
template<size_t N, typename Fn>
requires (N >= 1) && requires( Fn fn, size_t i ) { { fn( i ) } -> std::convertible_to<bool>; }
inline
bool unroll( Fn fn )
{
auto unroll_n = [&]<size_t ... Indices>( std::index_sequence<Indices ...> ) -> bool
{
return (fn( Indices ) && ...);
};
return unroll_n( std::make_index_sequence<N>() );
}
template<size_t N, typename Fn>
requires (N >= 1) && requires( Fn fn ) { { fn() } -> std::convertible_to<bool>; }
inline
bool unroll( Fn fn )
{
auto unroll_n = [&]<size_t ... Indices>( std::index_sequence<Indices ...> ) -> bool
{
return ((Indices, fn()) && ...);
};
return unroll_n( std::make_index_sequence<N>() );
}
template<std::size_t N, typename RandomIt, typename UnaryFunction>
requires std::random_access_iterator<RandomIt>
&& requires( UnaryFunction fn, typename std::iterator_traits<RandomIt>::value_type elem ) { { fn( elem ) }; }
inline
RandomIt unroll_for_each( RandomIt begin, RandomIt end, UnaryFunction fn )
{
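RandomIt &it = begin;
if constexpr( N > 1 )
// compare the remaining distance rather than forming it + N, which could point past end
for( ; static_cast<std::size_t>( end - it ) >= N; it += N )
unroll<N>( [&]( size_t i ) { fn( it[i] ); } );
// handle the leftover elements one at a time
for( ; it < end; ++it )
fn( *it ); // it aliases begin, so this is the same element as *begin
return it;
}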
Be aware, though, that unrolling isn't beneficial in most cases. Sometimes it is: I optimized Fletcher's hash with unroll_for_each and got 95% more throughput on my machine. The unrolling factor was crucial, though: with an unrolling factor of 5 I got the mentioned 95%, with an unrolling factor of 5 the code had the same performance as without unrolling.
Upvotes: 0
Reputation: 25388
You can make use of std::index_sequence and a fold expression, for example:
#include <iostream>
#include <utility>
template <size_t x>
void foo ()
{
std::cout << x << " ";
}
template <size_t... indices>
void unroll (std::index_sequence <indices...>)
{
(foo <indices> (), ...);
}
int main ()
{
unroll (std::make_index_sequence <32> {});
}
Output:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
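Applied to the pattern from the question, the same fold might look like the sketch below. This is only an illustration: system_t is a hypothetical stand-in for the question's system template (renamed so it cannot clash with ::system from the C library), and call_log, do_all and run_all_systems are names made up here so the call order can be checked.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: records which instances were called, in order.
inline std::vector<std::size_t> call_log;

// Hypothetical stand-in for the question's system<N>.
template <std::size_t N>
struct system_t
{
    static system_t global_instance;
    void do_something() { call_log.push_back(N); }
};

// One global instance per specialization, instantiated on first use.
template <std::size_t N>
system_t<N> system_t<N>::global_instance{};

// The fold over the comma operator expands to
// system_t<0>::global_instance.do_something(), system_t<1>::..., and so on.
template <std::size_t... indices>
void do_all(std::index_sequence<indices...>)
{
    (system_t<indices>::global_instance.do_something(), ...);
}

// Replaces the 32 hand-written lines (indices 0 through 31).
inline void run_all_systems()
{
    do_all(std::make_index_sequence<32>{});
}
```

Calling run_all_systems() once should leave the indices 0 to 31 in call_log in ascending order, since a comma fold evaluates its operands left to right. This needs C++17 for the fold expression and the inline variable.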
Update: As per HTNW's comment, in C++20 you can pass a template lambda as an additional argument to unroll to avoid having to hard code a call to foo:
#include <iostream>
#include <utility>
template <size_t x>
void foo ()
{
std::cout << x << " ";
}
template <size_t... indices>
void unroll (auto f, std::index_sequence <indices...>)
{
(f.template operator () <indices> (), ...);
}
int main ()
{
unroll ([] <size_t i> () { foo <i> (); }, std::make_index_sequence <32> {});
}
Upvotes: 7
Reputation: 1163
I am not sure if this is good for you, but you can do something like this in C++17:
template <size_t curr, size_t max>
void unroll(){
doSomething<curr>();
if constexpr (curr < max)
unroll<curr+1,max>();
}
Or as HTNW pointed out you can do:
template <size_t curr, size_t max>
void unroll(auto f){
f.template operator()<curr>(); // See below
if constexpr (curr < max)
unroll<curr + 1,max>(f);
}
I had to adjust the suggestion, but this should work in C++20. Note that max itself is included here, so unroll<0,20> makes 21 calls. You can then call this as:
unroll<0,20>([]<size_t i> () { system<i>::global_instance.do_something(); });
As to why you need f.template operator()<curr>(), see this.
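In short, as far as I understand it: a lambda written as []<size_t i>() { ... } is an object whose operator() is a template taking no function arguments, so i cannot be deduced and has to be passed explicitly. The rough sketch below spells that out with a hand-written class instead of a lambda, which also means it does not need C++20. The names times_ten and call_at are made up for this illustration.

```cpp
#include <cstddef>

// Roughly the shape of class the compiler generates for
// []<std::size_t i>() { return i * 10; }
// The call operator is a template with no function parameters,
// so nothing at the call site can deduce i.
struct times_ten
{
    template <std::size_t i>
    constexpr std::size_t operator()() const { return i * 10; }
};

template <std::size_t curr, typename F>
constexpr std::size_t call_at(F f)
{
    // f() is shorthand for f.operator()(). Writing the operator name out
    // is what allows explicit template arguments: operator()<curr>.
    // The `template` keyword is required because the type of f depends on
    // a template parameter; without it the compiler would parse the `<`
    // after operator() as a less-than sign.
    return f.template operator()<curr>();
}
```

With these definitions, call_at<3>(times_ten{}) evaluates to 30. Inside unroll the situation is the same: f has a type that depends on a template parameter, so the call has to be written as f.template operator()<curr>().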
Upvotes: 2