ruff

Reputation: 71

Why does std::array require the size as a template parameter and not as a constructor parameter?

I have run into many design issues with this, particularly when passing std::array<> to functions. When you declare a std::array, it takes two template parameters, <class T, size_t size>. However, when you write a function that takes a std::array, the size is not known in advance, so the function needs a template parameter as well.

template <size_t params_size> auto func(std::array<int, params_size> arr);

Why couldn't std::array take in the size at the constructor instead? (i.e.):

auto array = std::array<int>(10);

Then the functions would look less aggressive and would not require template params, as such:

auto func (std::array<int> arr);

I just want to know the design choice for std::array, and why it was designed this way.

This isn't a question due to a bug, but rather a question why std::array<> was designed in such a manner.

Upvotes: 5

Views: 1462

Answers (5)

Yakk - Adam Nevraumont

Reputation: 275395

There are a few contiguous containers and ranges in C++ std. They serve different purposes. There are also a few techniques for passing them around.

I'll try to be exhaustive.

std::array<int, 7>

This is a buffer of 7 ints. They are stored within the object itself. Putting an array somewhere means putting enough storage for exactly 7 ints in that location (plus possible padding for alignment reasons, but that is at the end of the buffer).

You use this when, at compile time, you know exactly how big something is, or need to know.
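To make "stored within the object itself" concrete, here is a minimal sketch. The names Packet, payload and payload_is_inline are made up for illustration and do not come from the question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical struct for illustration: the 7 ints live inside Packet itself.
struct Packet {
    std::array<int, 7> payload{};
};

// The size is part of the type, so it can be used in constant expressions.
static_assert(std::array<int, 7>{}.size() == 7, "size is known at compile time");
// The object must be at least large enough to hold its 7 elements in place.
static_assert(sizeof(Packet) >= 7 * sizeof(int), "elements are stored in the object");

// Returns true if the first element lies within the Packet object's own bytes.
bool payload_is_inline(const Packet& p) {
    const char* begin = reinterpret_cast<const char*>(&p);
    const char* first = reinterpret_cast<const char*>(p.payload.data());
    return first >= begin && first < begin + sizeof(Packet);
}

void example_inline_storage() {
    Packet p;
    assert(payload_is_inline(p));  // no separate heap buffer is involved
}
```

Because the element count is baked into the type, both static_asserts are checked by the compiler, not at run time.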

std::vector<int>

This object holds ownership of a buffer of ints. The memory that holds those ints is dynamically allocated and can change at runtime. The object itself is usually 3 pointers in size. It has a growth strategy that avoids doing N^2 work when you keep adding 1 element at a time to it.

This object can be efficiently moved -- it will steal the buffer if the old object is marked (by std::move or other ways) as being safe to steal state from.
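A small sketch of the buffer stealing, assuming the default allocator (move_reuses_buffer is a name invented for this example):

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Moves `source` into a new vector and reports whether the heap buffer was reused.
bool move_reuses_buffer(std::vector<int> source) {
    const int* before = source.data();
    std::vector<int> target = std::move(source);  // takes over the buffer, no element copy
    return target.data() == before;
}

void example_move() {
    // Assumption: default allocator, where move construction transfers the buffer.
    assert(move_reuses_buffer({1, 2, 3, 4}));
}
```

The same trick is impossible for std::array: its elements live inside the object, so moving one means moving every element.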

std::span<int>

This represents an externally owned sequence of ints, possibly stored in a std::array or owned by a std::vector, or stored somewhere else. It knows where in memory it starts and where it ends.

Unlike the two above, it is not a container, but a range or a view of the contents. So assigning one span to another only rebinds the view and does not copy any elements (the semantics can be confusing), and you are responsible for ensuring that the source buffer lasts "long enough" that you don't use the span after the buffer is gone.

span is often used as a function argument. In your case, it probably solves most of your problem -- it lets you pass arrays of different sizes to a function, and within that function you can read or write the values.

span follows pointer semantics. That means const std::span<int> is like an int* const -- the pointer is const, but the thing pointed to is not! You are free to modify the elements in const std::span<int>. In comparison, std::span<const int> is like an int const* -- the pointer is not const, but the thing pointed to is. You are free to change what range of elements the span refers to in std::span<const int>, but you aren't allowed to modify the elements themselves.

A final technique is auto or templates. Here we implement the body of the function in the header (or equivalent) and leave the type unconstrained (or, constrained by concepts).

template<std::size_t N>
int total0( std::array<int, N> const& elems ) {
  int r = 0;
  for (int e:elems) r+=e;
  return r;
}

int total1( std::vector<int> const& elems ) {
  int r = 0;
  for (int e:elems) r+=e;
  return r;
}

int total2( std::span<int const> elems ) {
  int r = 0;
  for (int e:elems) r+=e;
  return r;
}

int total3( auto const& elems ) {
  int r = 0;
  for (int e:elems) r+=e;
  return r;
}

template<class Ints>
int total4( Ints const& elems ) {
  int r = 0;
  for (int e:elems) r+=e;
  return r;
}

Notice these all have the same implementation.

total3 and total4 are identical; total3 uses the abbreviated function template syntax, which needs a C++20 compiler.

total1 and total2 allow you to split the implementation into a cpp file away from the header file. Also, code is generated only once for them, not once per argument type.

total0, total3 and total4 all cause different code to be generated for each distinct argument type. This can cause binary bloat issues, especially if the body is more complex than shown, and causes build time problems in larger projects.

total1 won't work with a std::array directly. You can do total1({arr.begin(), arr.end()}) which would copy the contents to a dynamic vector before using the code.

Finally, note that span<int> is the closest you get to the C way of arr[], size. Span is, in essence, a pointer-to-first and length pair, with utility code wrapping it.

Upvotes: 1

Dúthomhas

Reputation: 10048

Not an answer, really, because I used to despise std::array<> for the same reasons as you: anything with Monadic qualities is not good design (IMNSHO).

Fortunately, C++20 has the solution: a dynamic std::span<>.

#include <array>
#include <iostream>
#include <span>

namespace detail
{
  void print( const std::span<const int> & xs )
  {
    for (size_t n = 0;  n < xs.size();  n++)
      std::cout << xs[n] << " ";
  }
}

void print( const std::span<const int> & xs )
{
  std::cout << "{ ";
  detail::print( xs );
  std::cout << "}\n";
}

void add( const std::span<int> & xs, int n )
{
  for (int & x : xs)
    x += n;
}

int main()
{
  std::array<int,5> xs { 1, 2, 4, 6, 10 };
  add( xs, 1 );
  print( xs );
}

Notice that the span itself is const in all cases, but the elements themselves are modifiable unless they too are tagged const. This is exactly what an array is like.

std::span is a C++20 object. I know that MS and maybe others had an array_view in older versions of their libraries.

tl;dr
Use std::array only to declare your array object. Pass it around with a dynamic std::span.


std::array vs C array

The use-case for std::array is actually very narrow: encapsulate a fixed-size array as a first-class container object (one that can be copied, not just referenced).

At first blush this doesn’t seem to be much of an improvement over standard C-style arrays:

typedef int myarray[10];             // (1)
using myarray = std::array<int,10>;  // (2)

void f( myarray a );

But it is! The difference is in what f() actually gets:

  1. For a C-style array, the argument is just a pointer: a reference to the caller’s data (that you can modify!). You know the size of the referenced array (10), but writing code to get that size is not straightforward, even with the usual C array-size idiom: you have to write sizeof(myarray)/sizeof(a[0]), since sizeof(a) is the size of a pointer.
  2. For the std::array, the argument value is an actual local copy of the caller’s data. If you want to modify the caller’s data, you need to declare the formal argument explicitly as a reference type (myarray & a); if you only want to avoid an expensive copy, use a const reference (const myarray & a). This falls in line with how other C++ objects are passed. And though the size is still 10, your code can query the size of the array with the usual C++ container idiom: a.size()!
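A minimal sketch of case (2), with function names invented for illustration:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using myarray = std::array<int, 10>;

// By value: works on a private copy, so the caller's array is untouched.
void clear_by_value(myarray a) { a.fill(0); }

// By reference: modifies the caller's array.
void clear_by_reference(myarray& a) { a.fill(0); }

// By const reference: no copy, no modification, and the size is still available.
std::size_t count(const myarray& a) { return a.size(); }

void example_passing() {
    myarray data;
    data.fill(7);
    clear_by_value(data);
    assert(data[0] == 7);        // the copy was cleared, not the original
    clear_by_reference(data);
    assert(data[0] == 0);        // now the original is cleared
    assert(count(data) == 10);
}
```

Either way the size travels with the type, which is exactly what a decayed C array loses.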

The usual way C overcomes this is to clutter the call site and formal argument lists with information about the array size so that it doesn’t get lost.

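#include <stdio.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

void f( int array[], size_t n )   // traditional C
{
  printf( "There are %zu elements.\n", n );
  // any further call has to pass the size along again: f( array, n );
}

int main(void)
{
  int my_array[10];
  f( my_array, ARRAY_SIZE(my_array) );
}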

The std::array way is cleaner.

#include <array>
#include <iostream>

void f( std::array<int,10> & array )   // C++
{
  std::cout << "There are " << array.size() << " elements.\n";
  // any further call needs only the array itself: f( array );
}

int main()
{
  std::array<int,10> my_array{};
  f( my_array );
}

But while cleaner, it is significantly less flexible than the C array, simply because its length is fixed. A caller cannot pass a std::array<int,12> to the function, for example.

I’ll refer you to the other good answers here to consider more about container choice when handling arrayed data.

Upvotes: 3

alfC

Reputation: 16242

If you have a problem with std::array and you think std::span is a solution, now you will have two problems.

More seriously, without knowing what kind of conceptual operation func is, it is difficult to tell what the right alternative is.

First, if you want to, or can, exploit knowing the size at compile time, there is nothing cooler than what you are trying to avoid:

template<std::size_t N>
void func(std::array<int, N> arr);   // add & or && or const& if appropriate

Imagine it, knowing the size at compile time can allow you and the compiler to do all sorts of tricks, like unrolling loops completely or verifying logic at compile time (e.g. if you know the size must be smaller or bigger than a constant). Or the coolest trick of all, not needing to allocate memory for any auxiliary operation inside func (because you know the size of the problem a priori).
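As a rough sketch of two of those tricks (a compile-time check, and an auxiliary buffer that needs no allocation), with reversed being a hypothetical stand-in for func:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Returns the elements in reverse order. The scratch buffer is another
// std::array of the same N, so no dynamic allocation is needed.
template <std::size_t N>
std::array<int, N> reversed(const std::array<int, N>& arr) {
    static_assert(N > 0, "this sketch rejects empty arrays at compile time");
    std::array<int, N> out{};
    for (std::size_t i = 0; i < N; ++i) out[i] = arr[N - 1 - i];
    return out;
}

void example_reversed() {
    const std::array<int, 3> r = reversed(std::array<int, 3>{1, 2, 3});
    assert(r[0] == 3 && r[1] == 2 && r[2] == 1);
}
```

The loop bound is a constant, so the compiler is free to unroll it; whether it actually does depends on the compiler and optimization settings.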

If you want a dynamic array, use (and pass) a std::vector.

void func(std::vector<int> dynarr);   // add & or && or const& if appropriate

But then you force your caller to use std::vector as the container.

If you want a fixed array, make the function a template, and it will work with everything:

template<class FixedArray>
void func(FixedArray dynarr);   // add & or && or const& if appropriate

Ask yourself, how specific is your function such that you really really want to make it work with any size of std::array but not with std::vector? Why specifically ints even?

template<class ArithmeticRange>
void func(ArithmeticRange dynarr);   // add & or && or const& if appropriate
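A sketch of what such a fully generic function might look like. sum_of is an invented name, and the only assumption about the range is that range-for can iterate it and its elements can be added to T:

```cpp
#include <array>
#include <cassert>
#include <vector>

// Works with anything range-for can iterate whose elements support `init += v`.
template <class ArithmeticRange, class T>
T sum_of(const ArithmeticRange& values, T init) {
    for (const auto& v : values) init += v;
    return init;
}

void example_sum() {
    const std::array<int, 3> a{1, 2, 3};
    const std::vector<double> v{0.5, 1.5};
    const int raw[] = {10, 20};
    assert(sum_of(a, 0) == 6);       // std::array of any size
    assert(sum_of(v, 0.0) == 2.0);   // std::vector, other element type
    assert(sum_of(raw, 0) == 30);    // even a C-style array
}
```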

Upvotes: 1

Henrique Bucher

Reputation: 4474

std::array<T,N> var is intended as a better replacement for C-style arrays T var[N].

The memory space for this object is created locally, i.e. on the stack for local variables or inside the struct itself when defined as a member.

std::vector<T>, by contrast, always allocates its elements' memory on the heap.

Therefore, as std::array is allocated locally, it cannot have a variable size, since that space needs to be reserved at compile time. std::vector, on the other hand, can reallocate and resize because its elements live in dynamically allocated memory.

As a consequence, the big advantage of std::array in terms of performance is that it eliminates that one level of indirection that std::vector pays for its flexibility.

For example:

#include <cstdio>
#include <iostream>
#include <vector>
#include <array>

int main() {
    int a;
    char b[10];
    std::vector<char> c(10);
    std::array<char,10> d;
    struct E {
        std::array<char,10> e1;
        std::vector<char> e2{10};  // note: braces give ONE element (value 10), not 10 elements
    };
    E e;

    printf( "Stack address:   %p\n", __builtin_frame_address(0));
    printf( "Address of a:    %p\n", &a );
    printf( "Address of b:    %p\n", b );
    printf( "Address of b[0]: %p\n", &b[0] );
    printf( "Address of c:    %p\n", &c );
    printf( "Address of c[0]: %p\n", &c[0] );
    printf( "Address of d:    %p\n", &d );
    printf( "Address of d[0]: %p\n", &d[0] );
    printf( "Address of e:    %p\n", &e );
    printf( "Address of e1:   %p\n", &e.e1 );
    printf( "Address of e1[0]:%p\n", &e.e1[0] );
    printf( "Address of e2:   %p\n", &e.e2);
    printf( "Address of e2[0]:%p\n", &e.e2[0] );
}

Produces

Stack address:   0x7fffeb115ed0
Address of a:    0x7fffeb115eb0
Address of b:    0x7fffeb115ea6
Address of b[0]: 0x7fffeb115ea6
Address of c:    0x7fffeb115e80
Address of c[0]: 0x1cad2b0
Address of d:    0x7fffeb115e76
Address of d[0]: 0x7fffeb115e76
Address of e:    0x7fffeb115e40
Address of e1:   0x7fffeb115e40
Address of e1[0]:0x7fffeb115e40
Address of e2:   0x7fffeb115e50
Address of e2[0]:0x1cad2d0

Godbolt: https://godbolt.org/z/75s47T56f

Upvotes: 12

Obsidian

Reputation: 3897

The main purpose of a C++11 std::array<> is to be a decent replacement for C-style arrays [], especially when they're allocated with new[] and released with delete[].

The main goal here is to get an official, managed object that serves as an array, while maintaining as constant expressions everything that can be.

The principal issue with regular arrays is that, since they're not actually class objects, one cannot derive a class from them (forcing you to implement iterators yourself), and they are a pain when you copy classes that use them as members.

Since new and new[] return pointers, each time you need either to implement a copy constructor that allocates another array and then copies its content, or to maintain your own reference count on it.

From this perspective, std::array<> is a good way to declare purely static arrays that will be managed by the language itself.
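A minimal sketch of the copying point, in contrast to a member that holds a pointer obtained from new[] (Grid is a hypothetical class made up for this example):

```cpp
#include <array>
#include <cassert>

// Hypothetical class for illustration: no user-written copy constructor.
struct Grid {
    std::array<int, 4> cells{};
};

void example_copy() {
    Grid a;
    a.cells[0] = 42;
    Grid b = a;          // compiler-generated copy duplicates all four ints
    b.cells[0] = 7;      // changing the copy leaves the original alone
    assert(a.cells[0] == 42);
    assert(b.cells[0] == 7);
}
```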

Upvotes: 0
