C++ copy constructor for a class with an array attribute

Question

I am creating a matrix template and ran into a problem writing the copy constructor. While data appears to be copied correctly from within the constructor, the object returned to the main program does not have the correct values (looks like its pointing to a different memory address). In my attempts to debug this I tried creating a minimalist example, though strangely this did not produce the same error. I get the sense that this issue is either beyond my understanding of C++ ...or caused by a typo I've somehow missed. Can anyone spot what I've done wrong?

matlib.h

#ifndef MATLIB_H
#define MATLIB_H
#include 

    namespace matlib{
        template 
        struct Matrix {
                unsigned int rows;   //number of rows
                unsigned int cols;   //number of columns
                unsigned int length; //number of elements
                T data[];                  //contents of matrix
    
                /* Constructors */
                Matrix(unsigned int m, unsigned int n) : rows(m), cols(n) {
                    length = m*n;
                    T data[m*n];
                    //::std::cout << "Hello from the null constructor!" << ::std::endl;
                    //::std::cout << "rows = " << rows << ", cols = " << cols << ", length = " << length << ::std::endl;
                }
                
                Matrix(const Matrix &mat) {
                    rows = mat.rows;
                    cols = mat.cols;
                    length = mat.length;
                    T data[mat.length];
                    ::std::cout << "Hello from the copy constructor!" << ::std::endl;                
                    for (int i = 0; i < length; ++i) {
                        data[i] = mat.data[i];
                        ::std::cout << "data[" << i << "] = " << data[i] << ", mat.data[" << i << "] = " << mat.data[i] << ::std::endl;
                    }
                } 
    
                //Single element indexing and assigment
                T& operator() (int i, int j) {
                    return data[ i*this->cols + j ];
                }
    
                T& operator() (unsigned int i, unsigned int j) {
                    return data[ i*this->cols + j ];
                }
    
                //Single element indexing and assigment
                T& operator() (int i) {
                    return data[i];
                }
    
                T& operator() (unsigned int i) {
                    return data[i];
                } 
    
                
        };
    
    }
    
    #endif

testscript.cpp

#include 
#include "matlib.h"

int main() {
    float arr[7] = {4, 1, 6, 6, 8, 4, 2};
    matlib::Matrix mat1(1,7);

    //Assign values and print
    std::cout << "mat1 = ";
    for (int i = 0; i < mat1.length; ++i) {
        mat1(i) = arr[i];
        std::cout << mat1(i) << " ";
    }
    std::cout << "
" << std::endl;

    //Copy the object
    matlib::Matrix mat2 = mat1;

    //Print the copied values
    std::cout << "mat2 = ";
    for (int i = 0; i < mat2.length; ++i) {
        std::cout << mat2(i) << " ";
    }
    std::cout << std::endl;

    return 0;
}

Console output:

mat1 = 4 1 6 6 8 4 2 

Hello from the copy constructor!
data[0] = 4, mat.data[0] = 4
data[1] = 1, mat.data[1] = 1
data[2] = 6, mat.data[2] = 6
data[3] = 6, mat.data[3] = 6
data[4] = 8, mat.data[4] = 8
data[5] = 4, mat.data[5] = 4
data[6] = 2, mat.data[6] = 2
mat2 = 9.80909e-45 1.4013e-45 9.80909e-45 9.80909e-45 4 1 6

I'm sure many people will suggest solutions involving 'std::vector' though this is mostly a learning exercise with HPC in mind. I will likely add bound checking once this is a bit more developed.

Bitwize · Accepted Answer

Short Version: Just use std::vector. It will will make your life easier, and has far less pitfalls than a manual approach.

Long Version: You have two major problems in your code:

You are using Flexible-Arrays incorrectly (and this is a compiler extension, not standard C++), and
You are using Variable-Length Arrays incorrectly (and this is also a compiler extension, not standard C++)

1. Flexible Array Members

The first compiler-extension you use is a feature from c called flexible arrays:

struct Matrix {
    ...
    T data[];
 // ^~~~~~~~~

c allows using arrays of unsized data at the end of a struct to denote objects that may be given dynamic sizes at runtime when allocated by malloc. This is not, however, valid standard c++ and it is ill-advised to be using this since it does not fit into C++'s allocator model at all.

This should be changed out for something more coherent.

2. Variable Length Arrays

The second extension you are using is also from c, called variable-length arrays:

Matrix(unsigned int m, unsigned int n) : rows(m), cols(n) {
    ...
    T data[m*n];

This is also not valid standard C++. You cannot construct an array from runtime values in C++ -- full stop. Arrays are known at compile-time, and only at compile-time.

Additionally, and this is where you are experiencing the problem, T data[m*n] is creating a new VLA named data that shadows the flexible array also named data. So each function you define T data[m*n] or T data[other.length], you are actually creating new arrays, writing to them, and then doing nothing with them. This is why you are seeing different addresses.

Suggested Fixes

Use heap memory, perhaps with std::unique_ptr to manage things for you. Allocate the size on construction, clone it on copy.

// Construction
Matrix(unsigned int m, unsigned int n) : rows(m), cols(n), data(std::make_unique(m * n))
// where 'data' is std::unique_ptr
{ ... }

This will then require a custom copy constructor:

Matrix(const Matrix& other) : rows(other.rows), cols(other.cols), data(std::make_unique(rows * cols)){
    // Copy all elements from 'other' to 'data'
    std::copy_n(other.get(), rows * cols, data.get());
}

Or, better yet:

Use std::vector. It already knows how to do lifetime and saves you from a number of pitfalls. If you already know the max size of the vector, you can just use resize or reserve+push_back and this saves reallocation costs.
```
Matrix(unsigned int m, unsigned int n) : rows(m), cols(n), data(m * n)     
// where 'data' is std::vector
{ ... }
```
Using std::vector you can just do:
```
Matrix(const Matrix& other) = default;
```
in your class declaration, and it will use std::vector's underlying copy constructor. This is a far better approach IMO.

A separate note on "High-Performance Computing"

I encourage you to not shy away from containers like std::vector purely for the purpose of HPC.

To be blunt, developers are notoriously bad at determining what is, and is not, good for performance. The underlying hardware plays the biggest role with factors like speculative execution, branch prediction, instruction pipelining, and cache locality. Heap memory and a few extra byte-copies are usually the least of your worries unless you are repeatedly growing the container in a very tight loop.

On the contrary, heap memory is easy to move around (e.g. move constructors), since it's a pointer-copy, whereas buffer storage would be copied in totality even for moves. Additionally, c++17 introduces polymorphic allocators with different options where memory resources come from -- allowing for far faster allocation options (e.g. a virtual memory resource that allocates full pages for std::vector).

Even where performance matters: Try a solution and profile before trying to optimize it. Don't waste effort up front, because the results may surprise you. Sometimes doing more work can result in faster code in the right conditions.

C++ copy constructor for a class with an array attribute

Answers (2)

1. Flexible Array Members

2. Variable Length Arrays

Suggested Fixes

A separate note on "High-Performance Computing"

Related Questions