Reputation: 21818

clang: Force loop unroll for specific loop

Is there a way to tell clang to unroll a specific loop?

Googling for an answer gives me command-line options that affect the whole compilation unit rather than a single loop.

There is a similar question for GCC, "Tell gcc to specifically unroll a loop", but the answers provided there do not work with clang.

Option 1 suggested there:

#pragma GCC optimize ("unroll-loops")

seems to be silently ignored. In fact

#pragma GCC akjhdfkjahsdkjfhskdfhd

is also silently ignored.

Option 2:

__attribute__((optimize("unroll-loops")))

results in a warning:

warning: unknown attribute 'optimize' ignored [-Wattributes]

Update

joshuanapoli provides a nice solution that iterates via template metaprogramming and C++11 without writing a loop. The construct is resolved at compile time, resulting in a repeatedly inlined body. While it is not exactly an answer to the question, it essentially achieves the same thing.

That is why I am accepting the answer. However, if you happen to know how to use a standard C loop (for, while) and force it to unroll, please share the knowledge with us!

Upvotes: 11

Answers (4)

Tom 7

Reputation: 527

In C++17 and later, you can write a more straightforward (to me) version of joshuanapoli's template approach, using if constexpr:

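#include <cstddef>

// Calls f(START), f(START + 1), ..., f(START + N - 1) without a runtime loop.
template<std::size_t N, class F, std::size_t START = 0>
inline void repeat(const F &f) {
  if constexpr (N == 0) {
    return;  // recursion ends here; no further instantiation happens
  } else {
    f(START);
    repeat<N - 1, F, START + 1>(f);
  }
}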

This version does not need additional () at invocation time:

  int accumulator = 3;
  repeat<4>([&](std::size_t x) {
    accumulator += x;
  });
  // accumulator is now 9 (3 + 0 + 1 + 2 + 3)

Upvotes: 3

Jingyue Wu

Reputation: 191

Clang recently gained loop unrolling pragmas (such as #pragma unroll) which can be used to specify full/partial unrolling. See http://clang.llvm.org/docs/AttributeReference.html#pragma-unroll-pragma-nounroll for more details.
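
As a rough sketch of how these pragmas might be applied to an ordinary for loop: the function names below are hypothetical and only the pragma spellings come from Clang's documentation. The pragmas are wrapped in a __clang__ check so that other compilers just see plain loops. They are hints to the optimizer, so they only take effect when optimization is enabled, and it is worth checking the generated assembly to confirm the loop was actually unrolled.

```cpp
#include <cstddef>

// Hypothetical helper: asks Clang to fully unroll a loop whose trip count
// (8) is a compile-time constant.
inline int sum_first_eight(const int* data) {
  int total = 0;
#if defined(__clang__)
#pragma clang loop unroll(full)
#endif
  for (std::size_t i = 0; i < 8; ++i) {
    total += data[i];
  }
  return total;
}

// Hypothetical helper: asks Clang to unroll by a factor of 4 when the trip
// count is only known at run time (partial unrolling).
inline int sum_all(const int* data, std::size_t n) {
  int total = 0;
#if defined(__clang__)
#pragma unroll 4
#endif
  for (std::size_t i = 0; i < n; ++i) {
    total += data[i];
  }
  return total;
}
```

The pragma has to sit immediately before the loop it applies to. The results are the same with or without the pragmas; only the generated code differs.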

Upvotes: 5

joshuanapoli

Reputation: 2509

For a C++ program, you can unroll loops within the language. You won't need to figure out compiler-specific options. For example,

#include <cstddef>
#include <iostream>

template<std::size_t N, typename FunctionType, std::size_t I>
class repeat_t
{
public:
  repeat_t(FunctionType function) : function_(function) {}
  FunctionType operator()()
  {
    function_(I);
    return repeat_t<N,FunctionType,I+1>(function_)();
  }
private:
  FunctionType function_;
};

template<std::size_t N, typename FunctionType>
class repeat_t<N,FunctionType,N>
{
public:
  repeat_t(FunctionType function) : function_(function) {}
  FunctionType operator()() { return function_; }
private:
  FunctionType function_;
};

template<std::size_t N, typename FunctionType>
repeat_t<N,FunctionType,0> repeat(FunctionType function)
{
  return repeat_t<N,FunctionType,0>(function);
}

void loop_function(std::size_t index)
{
  std::cout << index << std::endl;
}

int main(int argc, char** argv)
{
  repeat<10>(loop_function)();
  return 0;
}

Example with complicated loop function

template<typename T, T V1>
struct sum_t
{
  sum_t(T v2) : v2_(v2) {}
  void operator()(std::size_t) { v2_ += V1; }
  T result() const { return v2_; }
private:
  T v2_;
};

int main(int argc, char* argv[])
{
  typedef sum_t<int,2> add_two;
  std::cout << repeat<4>(add_two(3))().result() << std::endl;
  return 0;
}
// output is 11 (3+2+2+2+2)

Using a closure instead of an explicit function object

int main(int argc, char* argv[])
{
  int accumulator{3};
  repeat<4>( [&](std::size_t)
  {
    accumulator += 2;
  })();
  std::cout << accumulator << std::endl;
}

Upvotes: 9

EHuhtala

Reputation: 587

As gross as it may be, you could isolate said for-loop into its own file and compile it separately (with its own command-line flags).

relevant, but currently unanswered clang-developers question
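
A minimal sketch of what the separate-file approach could look like. The file name, function name, and build commands are assumptions for illustration; -funroll-loops is the whole-file flag the question refers to, applied here to one small translation unit only.

```cpp
// hot_loop.cpp (hypothetical file name): contains only the loop to unroll.
//
// Assumed build commands; adjust to your project:
//   clang++ -O2 -funroll-loops -c hot_loop.cpp   # unrolling for this file only
//   clang++ -O2 -c main.cpp                      # the rest uses the usual flags
//   clang++ hot_loop.o main.o -o app
#include <cstddef>

// Hypothetical function; main.cpp would need a matching declaration.
int dot_product(const int* a, const int* b, std::size_t n) {
  int total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    total += a[i] * b[i];
  }
  return total;
}
```

The flag still only requests unrolling, so the compiler may decide differently for a given loop. Moving the loop out of line can also prevent inlining into its callers unless link-time optimization is used.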

Upvotes: 2
