JimmyHu

Reputation: 519

Memory management optimization in C++

I am trying to implement a series of transforms. To keep the example simple, class A represents the object before the transform and class B the object after it. In other words, an A can be transformed into a B, and a B can be inverse-transformed back into an A. The data container, which is implemented with std::unique_ptr, is common to class A and class B and has been extracted into class Base, shown below.

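#include <cmath>      //  std::cos, std::acos
#include <iostream>   //  std::cout
#include <memory>     //  std::unique_ptr, std::make_unique, std::to_address (C++20)

class Base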
{
public:
    Base()                                                                  //  Default constructor
    {
        this->size = 0;
    }

    ~Base()
    {
    }

    Base(int input_size, float input_value)                                 //  Constructor
    {
        this->size = input_size;
        this->data = std::make_unique<float[]>(input_size);
        for (int loop_number = 0; loop_number < size; loop_number++) {
            data[loop_number] = input_value;
        }
    }
    std::unique_ptr<float[]> get_data()
    {
        //  deep copy
        auto return_data = std::make_unique<float[]>(size);
        for (int loop_number = 0; loop_number < size; loop_number++) {
            return_data[loop_number] = data[loop_number];
        }
        return return_data;
    }
    int get_size()
    {
        return this->size;
    }
protected:
    int size;
    std::unique_ptr<float[]> data;
};

Next, class A and class B inherit from class Base.

class B;

class A : public Base
{
public:
    A(int input_size, std::unique_ptr<int[]> const& input_data)                 //  constructor
    {
        std::cout << "Object A " << std::to_address(this) << " constructed.\n"; //  for observation
        this->size = input_size;
        this->data = std::make_unique<float[]>(this->size);
        for (int loop_number = 0; loop_number < input_size; loop_number++)
        {
            this->data[loop_number] = input_data[loop_number];                  //  Deep copy
        }
    }

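    A& operator=(A const& InputImage)                                           //  Copy Assign
    {
        if (this == &InputImage) { return *this; }                              //  Self-assignment: nothing to do
        if (this->size != InputImage.size)                                      //  Reallocate only when the sizes differ
        {
            this->data = std::make_unique<float[]>(InputImage.size);
        }
        this->size = InputImage.size;
        for (int loop_number = 0; loop_number < this->size; loop_number++)
        {
            this->data[loop_number] = InputImage.data[loop_number];             //  Deep copy
        }
        return *this;
    }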

    ~A()
    {
        std::cout << "Object A " << std::to_address(this) << " destructed.\n";  //  for observation
    }
    B to_B();

private:
    int transform_to_B(int input_value)
    {
        return std::cos(input_value); // For example
    }

};

class B : public Base
{
public:
    B(int input_size, std::unique_ptr<int[]> const& input_data)                 //  constructor
    {
        std::cout << "Object B " << std::to_address(this) << " constructed.\n"; //  for observation
        this->size = input_size;
        this->data = std::make_unique<float[]>(this->size);
        for (int loop_number = 0; loop_number < input_size; loop_number++)
        {
            this->data[loop_number] = input_data[loop_number];                  //  Deep copy
        }
    }
    auto to_A()
    {
        std::unique_ptr<int[]> transformed_data = std::make_unique<int[]>(this->size);
        for (int loop_number = 0; loop_number < this->size; loop_number++) {
            transformed_data[loop_number] = transform_to_A(this->data[loop_number]);
        }
        return A(this->size, transformed_data);
    }
    ~B()
    {
        std::cout << "Object B " << std::to_address(this) << " destructed.\n";  //  for observation
    }
private:
    int transform_to_A(int input_value)
    {
        return std::acos(input_value); // For example
    }
};

B A::to_B()
{
    std::unique_ptr<int[]> transformed_data = std::make_unique<int[]>(this->size);
    for (int loop_number = 0; loop_number < this->size; loop_number++) {
        transformed_data[loop_number] = transform_to_B(this->data[loop_number]);
    }
    return B(this->size, transformed_data);
}

The main function tests the transformations between class A and class B.

int main()
{
    const int size_for_testing = 3840 * 2160;
    auto data_for_testing = std::make_unique<int[]>(size_for_testing);
    for (int loop_number = 0; loop_number < size_for_testing; loop_number++) {
        data_for_testing[loop_number] = 1;                              //  for example
    }
    A a_object(size_for_testing, data_for_testing);

    for (int loop_times = 0; loop_times < 1000; loop_times++)           //  for observation
    {
        //  version 1
        a_object = a_object.to_B().to_A().to_B().to_A().to_B().to_A();
    }
    return 0;
}

The console output is

Object A 00000038FC19FE28 constructed.
Object B 00000038FC19FE10 constructed.
Object A 00000038FC19FE00 constructed.
Object B 00000038FC19FDF0 constructed.
Object A 00000038FC19FDE0 constructed.
Object B 00000038FC19FDD0 constructed.
Object A 00000038FC19FDC0 constructed.
Object A 00000038FC19FDC0 destructed.
Object B 00000038FC19FDD0 destructed.
Object A 00000038FC19FDE0 destructed.
Object B 00000038FC19FDF0 destructed.
Object A 00000038FC19FE00 destructed.
Object B 00000038FC19FE10 destructed.
Object B 00000038FC19FE10 constructed.
Object A 00000038FC19FE00 constructed.
Object B 00000038FC19FDF0 constructed.
......

I understand that it makes sense for the intermediate objects created during processing to be deallocated at the end of the scope in the for loop. However, in a more complex case, such as a_object = a_object.to_B().to_A().to_B().to_A().to_B().to_A().to_B().to_A().to_B().to_A().to_B().to_A();, keeping all of these intermediate objects alive may consume a tremendous amount of memory. I am curious about the concept or philosophy behind the design of "deallocating an object at the end of the scope where it was declared". Perhaps it could be optimized based on usage.

On the other hand, the memory usage of the separated form shown below is similar to that of version 1.

        //  version 2
        auto temp1 = a_object.to_B();
        auto temp2 = temp1.to_A();
        auto temp3 = temp2.to_B();
        auto temp4 = temp3.to_A();
        auto temp5 = temp4.to_B();
        a_object = temp5.to_A();

To try to decrease the memory consumption, I also considered lambda expressions, as below. However, the memory usage is again similar to that of version 1.

        //  version 3
        auto temp1 = [](auto& input_object) { return input_object.to_B(); }(a_object);
        auto temp2 = [](auto& input_object) { return input_object.to_A(); }(temp1);
        auto temp3 = [](auto& input_object) { return input_object.to_B(); }(temp2);
        auto temp4 = [](auto& input_object) { return input_object.to_A(); }(temp3);
        auto temp5 = [](auto& input_object) { return input_object.to_B(); }(temp4);
        auto a_object = [](auto& input_object) { return input_object.to_A(); }(temp5);

By the way, it seems that this kind of lambda expression can't be nested as below. The compiler reports error C2664: 'auto main::::operator ()(_T1 &) const': cannot convert argument 1 from 'B' to '_T1 &'

a_object = [](auto& input_object) { return input_object.to_A(); }(
                        [](auto& input_object) { return input_object.to_B(); }(
                        [](auto& input_object) { return input_object.to_A(); }(
                        [](auto& input_object) { return input_object.to_B(); }(
                        [](auto& input_object) { return input_object.to_A(); }(
                        [](auto& input_object) { return input_object.to_B(); }(a_object))))));
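
A note on the C2664 error above: it is consistent with the rule that a non-const lvalue reference parameter (`auto&`) cannot bind to a temporary, and each inner lambda call returns one. A minimal sketch of one way around it, assuming hypothetical stand-in types `X` and `Y` (not the A and B above) and using a forwarding reference `auto&&`, which accepts both lvalues and temporaries:

```cpp
#include <cassert>

// Hypothetical stand-in types; only the reference binding matters here.
struct Y;
struct X { int hops = 0; Y to_Y() const; };
struct Y { int hops = 0; X to_X() const; };

// Each conversion returns a new object by value and counts the hop.
inline Y X::to_Y() const { return Y{hops + 1}; }
inline X Y::to_X() const { return X{hops + 1}; }

// Nests four lambda calls; the three inner results are temporaries.
int nested_hops(const X& start)
{
    // auto&& is a forwarding reference, so it binds to the lvalue `start`
    // as well as to the temporaries returned by the inner calls.
    auto to_Y = [](auto&& input) { return input.to_Y(); };
    auto to_X = [](auto&& input) { return input.to_X(); };
    return to_X(to_Y(to_X(to_Y(start)))).hops;
}
```

A `const auto&` parameter would also bind to temporaries, provided the called member functions are const-qualified. Neither form changes how long the temporaries live.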

Finally, my questions are:

1) I am curious about the concept or philosophy behind the design of "deallocating an object at the end of the scope where it was declared". Perhaps it could be optimized based on usage.

2) Is there a better way to decrease the memory consumption of this kind of cascade structure, such as the more complex case a_object = a_object.to_B().to_A().to_B().to_A().to_B().to_A().to_B().to_A().to_B().to_A().to_B().to_A();?

Environment:

CPU: Intel® Core™ i7-6700HQ 2.6GHz

RAM: 16GB

OS: Windows 10 1909

IDE: Microsoft Visual Studio Community 2019 Version 16.4.5

Upvotes: 0

Views: 141

Answers (1)

Jarod42

Reputation: 217398

1) I am curious about the concept or philosophy behind the design of "deallocating an object at the end of the scope where it was declared". Perhaps it could be optimized based on usage.

It allows the temporary to be used safely within that statement: a temporary lives until the end of the full-expression that creates it.

The as-if rule might permit deallocating earlier if the observable behavior is identical. That is harder to establish here, because your destructors produce output.
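
To make the lifetime rule concrete, here is a minimal sketch, assuming C++17 and a hypothetical `Tracer` type (not part of the question's code) that logs its own construction and destruction. It shows that every temporary in a chained call stays alive until the end of the statement, and that the temporaries are then destroyed in reverse order of construction, which matches the console output in the question.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical tracer type (not part of the question's code): it records
// construction and destruction events so their order can be inspected.
struct Tracer
{
    // Shared event log.
    static std::vector<std::string>& log()
    {
        static std::vector<std::string> events;
        return events;
    }

    explicit Tracer(int value) : id(value) { log().push_back("ctor " + std::to_string(id)); }
    ~Tracer() { log().push_back("dtor " + std::to_string(id)); }

    // Returns a new temporary, much like to_A()/to_B() in the question.
    Tracer next() const { return Tracer(id + 1); }

    int id;
};

// Runs a chained expression and returns the id of the last link.
int run_chain()
{
    Tracer::log().clear();
    Tracer start(0);
    // Three temporaries are created here; all of them stay alive until the
    // semicolon, then they are destroyed in reverse order of construction.
    int last_id = start.next().next().next().id;
    assert(Tracer::log().back() == "dtor 1");   // temporaries are gone by now
    Tracer::log().push_back("statement done");
    return last_id;
}
```

Because the destructor has a visible side effect (the log entry), a compiler could not move these destructions earlier under the as-if rule.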

2) Is there a better way to decrease the memory consumption of this kind of cascade structure, such as the more complex case a_object = a_object.to_B().to_A().to_B().to_A().to_B().to_A().to_B().to_A().to_B().to_A().to_B().to_A();?

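You can add an overload for temporaries, using an rvalue ref-qualifier (&&). Note that the existing to_B() must then be given a & or const & qualifier, because a ref-qualified overload cannot coexist with an unqualified one that has the same parameters: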

B A::to_B() &&
{
    std::unique_ptr<int[]> transformed_data = std::make_unique<int[]>(this->size);
    for (int loop_number = 0; loop_number < this->size; loop_number++) {
        transformed_data[loop_number] = transform_to_B(this->data[loop_number]);
    }
    auto Bsize = this->size;
    this->size = 0;
    this->data.reset(); // We clear here.
                        // Implementation might even really transfer the buffer
    return B(Bsize, std::move(transformed_data));
}
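
The comment about really transferring the buffer can be sketched as follows. This is only an illustration under assumptions: `First` and `Second` are hypothetical stand-ins for A and B that both store `float` data, and the doubling in `transform` is a placeholder for the real computation. The lvalue overload must leave the source intact and therefore allocates; the rvalue overload transforms in place and hands its buffer over, so a chain of temporaries needs no extra allocation per step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical target type: takes ownership of an existing buffer.
class Second
{
public:
    Second(std::size_t size, std::unique_ptr<float[]> data)
        : size_(size), data_(std::move(data)) {}

    std::size_t size() const { return size_; }
    const float* raw() const { return data_.get(); }
    float at(std::size_t index) const { return data_[index]; }

private:
    std::size_t size_;
    std::unique_ptr<float[]> data_;
};

// Hypothetical source type with one overload per value category.
class First
{
public:
    First(std::size_t size, float value)
        : size_(size), data_(std::make_unique<float[]>(size))
    {
        std::fill_n(data_.get(), size, value);
    }

    // Lvalue overload: *this must stay usable, so a new buffer is allocated.
    Second to_second() const &
    {
        auto out = std::make_unique<float[]>(size_);
        for (std::size_t i = 0; i < size_; ++i) { out[i] = transform(data_[i]); }
        return Second(size_, std::move(out));
    }

    // Rvalue overload: *this is expiring, so transform in place and
    // transfer the buffer instead of allocating a second one.
    Second to_second() &&
    {
        for (std::size_t i = 0; i < size_; ++i) { data_[i] = transform(data_[i]); }
        return Second(std::exchange(size_, 0), std::move(data_));
    }

    std::size_t size() const { return size_; }
    const float* raw() const { return data_.get(); }

private:
    static float transform(float x) { return x * 2.0f; }   // placeholder transform

    std::size_t size_;
    std::unique_ptr<float[]> data_;
};
```

Calling `object.to_second()` on a named object selects the copying overload, while `std::move(object).to_second()` or a call on a temporary selects the transferring one. This only works directly when both types can share the same element type and buffer size.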

Upvotes: 2
