Reputation: 4314

Is there standard C++ template class for variable-length arrays with maximum size for good cache locality?

I'm developing a software where cache locality is king. It uses about 10 different C arrays in complex data structures that have #define MAX_ARRAY_1_SIZE 1000 like defines. The number of elements in this example 1 array can be then between 0-1000. There are arrays of structs within arrays of structs. The amount of memory used is about 163 megabytes.

However, I noticed C's qsort is slow so I switched to C++ which has much faster std::sort due to inlining the comparator function call. The wonders of template metaprogramming.

Now I'm wondering whether I should use C++'s more advanced features. I tried std::vector in just two of the about 10 arrays, only to find out it kills my cache locality, leading to 15% performance reduction when only two arrays of the about 10 were replaced by std::vector. Yes, I did find out it was due to cache locality: during the critical code path, the array size remains unchanged and it's only populated at program startup time.

Then I tried to see whether there is something that stores the elements in-line, not in a separate dynamically allocated block. I found std::array. However, std::array is a fixed size array and the benefit over C arrays is questionable.

I'm looking for a variable-sized array with maximum size and inline element storage. Is there such a thing in standard C++? Or in Boost?

It doesn't hurt if the variable-sized array actually could have infinite maximum size, so that it stores data in an in-line array if the data fits there, and resorts to using an allocated array if the data doesn't fit in the in-line array.

Now, an experienced C++ programmer could write such a template class in few hours, and I could do the same in few days due to using C++ last time about 10 years ago before I had to obtain the std::sort performance increase now. But I don't want to reinvent the wheel, so I'm looking for existing solutions.

Edit:

To make myself clear, let me provide an example:

struct baz {
  int a;
  int b;
};

struct bar {
  struct baz baz[100];
};

struct foo {
  struct bar bar[100];
};

Now struct foo is one contiguous block of memory.

struct baz {
  int a;
  int b;
};

struct bar {
  std::vector<struct baz> baz;
};

struct foo {
  std::vector<struct bar> bar;
};

...now it isn't.

Upvotes: 2

Answers (2)

juhist

Reputation: 4314

I decided to implement my own:

template<class C, class sz_t, sz_t maxsz> class inlinearray {
  private:
    typedef C value_type;
    typedef value_type *pointer;
    typedef const value_type *const_pointer;
    typedef value_type &reference;
    typedef const value_type &const_reference;
    typedef value_type *iterator;
    typedef const value_type *const_iterator;
    typedef sz_t size_type;
    typedef std::ptrdiff_t difference_type;
    typedef std::reverse_iterator<iterator> reverse_iterator;
    typedef std::reverse_iterator<const_iterator> const_reverse_iterator;
    sz_t sz;
    union {
      C realarray[maxsz]; // for correct alignment
      char array[maxsz*sizeof(C)];
    };
  public:
    inlinearray()
    {
      sz = 0;
    }
    ~inlinearray(void)
    {
      clear();
    }
    void clear(void)
    {
      sz_t i;
      for (i = 0; i < sz; i++)
      {
        data()[i].~C();
      }
      sz = 0;
    }
    template<class sz2_t, sz2_t maxsz2> inlinearray(inlinearray<C,sz2_t,maxsz2> that)
    {
      size_t i;
      sz = that.sz;
      for (i = 0; i < sz; i++)
      {
        push_back(that[i]);
      }
    }
    template<class sz2_t, sz2_t maxsz2> void operator=(inlinearray<C,sz2_t, maxsz2> val2)
    {
      swap(val2);
    }
    void fill(const C& val)
    {
      std::fill_n(begin(), size(), val);
    }
    C &operator[](sz_t i) noexcept
    {
      return data()[i];
    }
    constexpr const C &operator[](sz_t i) const noexcept
    {
      return data()[i];
    }
    C at(sz_t i)
    {
      if (i >= sz)
      {
        throw std::out_of_range("inlinerray::at() out of range");
      }
      return data()[i];
    }
    constexpr const C at(sz_t i) const
    {
      if (i >= sz)
      {
        throw std::out_of_range("inlinerray::at() out of range");
      }
      return data()[i];
    }
    void push_back(const C &c)
    {
      if (sz >= maxsz)
      {
        abort();
      }
      new (data()+sz) C(c);
      sz++;
    }
    void pop_back() noexcept
    {
      data()[sz-1].~C();
      sz--;
    }
    template <class sz2_t, sz2_t maxsz2> void swap(inlinearray<C, sz2_t, maxsz2> &that)
    {
      if (that.sz > maxsz)
      {
        abort();
      }
      if (sz > that.maxsz)
      {
        abort();
      }
      std::swap_ranges(begin(), end(), that.begin());
      std::swap(sz, that.sz);
    }
    constexpr sz_t size(void) const noexcept { return sz; }
    constexpr sz_t max_size(void) const noexcept { return maxsz; }
    constexpr bool empty() const noexcept { return sz == 0; }
    C *begin() noexcept { return data(); }
    C &front() noexcept { return data()[0]; }
    C &back() noexcept { return sz == 0 ? data()[0] : data()[sz - 1]; }
    constexpr const C &back() const noexcept { return sz == 0 ? data()[0] : data()[sz - 1]; }
    C *end() noexcept { return data() + sz; }
    C* data() noexcept { return reinterpret_cast<C*>(array); }
    const C* data() const noexcept { return reinterpret_cast<const C*>(array); }
    const C *begin() const noexcept { return data(); }
    const C *end() const noexcept { return data() + sz; }
    const C *cbegin() const noexcept { return data(); }
    const C *cend() const noexcept { return data() + sz; }
    reverse_iterator rbegin() noexcept { return reverse_iterator(end()); }
    reverse_iterator rend() noexcept { return reverse_iterator(begin()); }
    const_reverse_iterator rbegin() const noexcept { return const_reverse_iterator(end()); }
    const_reverse_iterator rend() const noexcept { return const_reverse_iterator(begin()); }
    const_reverse_iterator crbegin() const noexcept { return const_reverse_iterator(end()); }
    const_reverse_iterator crend() const noexcept { return const_reverse_iterator(begin()); }
};

Usage of the code should be straightforward.

Upvotes: 0

Yakk - Adam Nevraumont

Reputation: 275956

There are many small vector classes out there. Boost has some. Try http://www.boost.org/doc/libs/master/doc/html/container/non_standard_containers.html#container.non_standard_containers.small_vector

An easy hand rolled solition with only a bit of overhead would be a gsl span attached to a variant of vector and std array. There is overhead over the most optimal solution of a few bytes. Interact using the span over the unknown container.

Upvotes: 1

Is there standard C++ template class for variable-length arrays with maximum size for good cache locality?

Answers (2)

Related Questions