Nikolay Stoynov

Reputation: 19

Allocating additional members of a dynamic array in C++

I am trying to find out if I can allocate additional memory for additional members of a dynamic array in C++. The code below is stripped down to the essentials for simplicity's sake.

What I am basically trying to do is read elements into arr in someClass from a file, but I want the file not to have to specify how many elements it contains (this is essential for the full project I am working on). The obvious solution is to allocate a new array of size+1, copy all the current elements into it, and then read a new element, but that seems a brutal waste of processing time and memory if it is possible to simply allocate the next memory address and use it for my array. Is there a way to do that in C++?

To summarize, I want to read an element, allocate a new memory address for my array, and then read another element into that newly allocated memory address. Repeat until the file ends.

I won't bore you with the reasons, but using a simple std::vector is not an option.

Here is the code:

class someClass {
public:
    int *arr;
    void Read(ifstream&,int&);
};

void someClass::Read(ifstream &inFile, int &size) {
    arr = new int[0]; // zero-length allocation: the writes below overflow it (undefined behavior)
    inFile.open("input.txt");

    int index = 0;
    int element;
    while (inFile >> element) { // read-then-test; while (!inFile.eof()) would process the last value twice
        *(arr + index) = element;
        index++;
    }

    size = index;
    inFile.close();
}

int main() {

    int size;
    someClass a;
    ifstream inFile;

    a.Read(inFile,size);
    //obviously unnecessary, just for testing
    for(int i = 0; i < size; i ++) {
        cout << a.arr[i] << " ";
    }
    cout << endl;
}

Upvotes: 1

Views: 256

Answers (5)

Ahmad Siavashi

Reputation: 999

I just liked the question and ran some experiments myself, using the MSVC14 compiler (optimizations disabled).
C++11/14 has the following sequence containers (I intentionally excluded dynarray, which was proposed alongside C++14 but never made it into the standard):

  1. No dynamic resizing (up to the programmer to allocate and deallocate)
    • Raw array (e.g. int arr[])
    • std::array (e.g. array<int, size>{})
  2. With dynamic resizing
    • vector (contiguous memory allocation)
    • list (doubly linked list)
    • forward_list (singly linked list)
    • deque (double-ended queue)

Let me start with your questions,

the solution of allocating a new array with a size+1, and then copying all the current elements into it and reading a new element, but I find that this would be a brutal waste of processing time and memory

You are right: when the memory you allocated runs out, you have to allocate a new block, copy the previous data over, and free the old block.
But wait! How much should you allocate (size+1 is bad)? Each time you are forced to allocate a bigger chunk, you are better off allocating a multiple of the size you already have, for example twice as much, so that you reduce the number of future reallocations; reallocation is considered an extremely expensive operation.
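As a rough, back-of-the-envelope model of why the growth factor matters (my own sketch, not part of the benchmarks in this answer):

```cpp
// Model (not a benchmark): starting from `start` elements of capacity,
// count how many reallocations growth factor `factor` needs before the
// capacity reaches at least `target` elements.
long long reallocationsNeeded(long long start, double factor, long long target) {
    long long capacity = start;
    long long reallocations = 0;
    while (capacity < target) {
        capacity = (long long)(capacity * factor);
        ++reallocations;
    }
    return reallocations;
}
```

Starting from 100 and growing to 100,000,000 elements, a 2X factor needs 20 reallocations and 4X only 10, while 1.5X needs roughly 35.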

if it is possible to simply allocate the next memory address and use it for my array. Is there a way to do that in C++?

It's not totally in your control, as the C++ runtime implements the memory-management functions. Where your newly allocated memory ends up is not up to you; however, sometimes the new block happens to get the same base address as the previous one, depending on the runtime and on memory fragmentation. C's realloc can sometimes extend a block in place, while C++'s new always gives you a fresh allocation.
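A minimal sketch of that C-style growth (the helper name is mine, not from the answer): realloc either extends the block in place or moves it, and the caller simply uses whatever pointer comes back.

```cpp
#include <cstdlib>

// Grow a malloc'd int array to hold newCount elements. realloc either
// extends the block in place or moves the existing contents to a new
// address; on failure it returns nullptr and leaves the old block intact.
bool grow(int *&arr, size_t newCount) {
    int *grown = (int *) realloc(arr, newCount * sizeof(int));
    if (grown == nullptr) return false; // old allocation still valid
    arr = grown;
    return true;
}
```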

I got some benchmarks using malloc and realloc functions borrowed from C. Here is the code:

    auto start = chrono::steady_clock::now();

    auto partialSize = 100;
    auto arr = (int *) malloc(sizeof(int) * partialSize);

    for (auto i = 0; i < SIZE; i++) { // SIZE = total insertions (100,000,000 here)
        arr[i] = i;
        if (i == partialSize - 1) {
            partialSize = partialSize << 1; // for 2X
            arr = (int *) realloc(arr, sizeof(int) * partialSize);
            // production code should check realloc's result for nullptr
        }
    }

    auto duration = chrono::steady_clock::now() - start;

    free(arr);

    cout << "Duration: " << chrono::duration_cast<chrono::milliseconds>(duration).count() << "ms" << endl;

Results (for insertion of 100,000,000 integers; time is avg. of 3 runs):

  • Start Size = 100, Increment Steps = 1.5X, Time(s) = 1.35s
  • Start Size = 100, Increment Steps = 2X, Time(s) = 0.65s
  • Start Size = 100, Increment Steps = 4X, Time(s) = 0.42s
  • Start Size = 10,000, Increment Steps = 1.5X, Time(s) = 0.96s
  • Start Size = 10,000, Increment Steps = 2X, Time(s) = 0.79s
  • Start Size = 10,000, Increment Steps = 4X, Time(s) = 0.51s

Another case is using C++'s new keyword and reallocating by hand:

    auto start = chrono::steady_clock::now();
    auto partialSize = 100;
    auto arr = new int[partialSize];

    for (auto i = 0; i < SIZE; i++) {
        arr[i] = i;
        if (i == partialSize - 1) {
            auto newArr = new int[partialSize << 2]; // for 4X
            copy(arr, arr + partialSize, newArr);    // keep the existing elements
            delete[] arr;                            // free the old block
            partialSize = partialSize << 2;
            arr = newArr;
        }
    }

    auto duration = chrono::steady_clock::now() - start;

    delete[] arr;

    cout << "Duration: " << chrono::duration_cast<chrono::milliseconds>(duration).count() << "ms" << endl;

Results (for insertion of 100,000,000 integers; time is avg. of 3 runs):

  • Start Size = 100, Increment Steps = 1.5X, Time(s) = 0.63s
  • Start Size = 100, Increment Steps = 2X, Time(s) = 0.44s
  • Start Size = 100, Increment Steps = 4X, Time(s) = 0.36s
  • Start Size = 10,000, Increment Steps = 1.5X, Time(s) = 0.65s
  • Start Size = 10,000, Increment Steps = 2X, Time(s) = 0.52s
  • Start Size = 10,000, Increment Steps = 4X, Time(s) = 0.42s

For the rest (dynamic resizable containers):

auto start = chrono::steady_clock::now();

//auto arr = vector<int>{};
//auto arr = list<int>{};
//auto arr = new std::array<int, SIZE>{};
//auto arr = new int[SIZE];
//auto arr = deque<int>{};
auto arr = forward_list<int>{}; // forward_list has no push_back

for (auto i = 0; i < SIZE; i++) {
    arr.push_front(i);
    // arr.push_back(i) for vector/list/deque; arr[i] = i for the two arrays
}

auto duration = chrono::steady_clock::now() - start;

cout << "Duration: " << chrono::duration_cast<chrono::milliseconds>(duration).count() << "ms" << endl;

Results (for insertion of 100,000,000 integers; time is avg. of 3 runs):

  1. vector

    • Time(s) = 2.17s
  2. list

    • Time(s) = 10.31s
  3. array (no reallocation)

    • Time(s) = N/A; Error: Compiler is out of heap.
  4. raw int array (no reallocation)

    • Time(s) = 0.22s
  5. deque

    • Time(s) = 3.47s
  6. forward_list

    • Time(s) = 8.78s

Hope it helps.

Upvotes: 1

Matt

Reputation: 2802

You can read the number of elements in the file by counting the delimiters and then size the array accordingly. For example, assume your file is newline-delimited:

1
2
3
4
5

You can count the number of lines in the file with the appropriate line separator. On Linux it can be done with:

int elemCount = std::count(std::istreambuf_iterator<char>(inFile),
    std::istreambuf_iterator<char>(), '\n');
inFile.clear();
inFile.seekg(0, std::ios::beg);

Then you can allocate the array as:

arr = new int[elemCount];

If your file is space- or tab-delimited, change '\n' to ' ' or '\t' as appropriate. You can then read in your information as before. You may need to add or subtract 1 depending on your delimiter and on how the file ends. This is also a little fragile, as empty rows, doubled delimiters, etc. could throw off the count. If I did this, I would fill the array with some default value and remove those entries after reading, to be sure all my values were good. That requires only one resize, after all the reading is done.
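The whole approach might be sketched like this (the function name is mine; it assumes an open stream of newline-delimited integers):

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>

// Two passes over the stream: first count the newline delimiters to size
// the array exactly, then rewind and read the values. Returns the number
// of integers actually read (may be below the count if blank lines exist).
int countAndRead(std::ifstream &inFile, int *&arr) {
    int elemCount = (int) std::count(std::istreambuf_iterator<char>(inFile),
                                     std::istreambuf_iterator<char>(), '\n');
    inFile.clear();                 // the counting pass left the stream at EOF
    inFile.seekg(0, std::ios::beg); // rewind for the reading pass
    arr = new int[elemCount];
    int size = 0;
    while (size < elemCount && inFile >> arr[size]) ++size;
    return size;
}
```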

Upvotes: 0

alessandrolenzi

Reputation: 91

I will propose a different solution, in which you do not need to copy the previously read contents of the file each time. The idea is to use a linked list in which every element has a size equal to double that of its predecessor (another possibility is to grow it like a Fibonacci series). This way, allocations become rarer and rarer as the file grows, and, if you need to, you can free memory from the beginning of the file by freeing the first elements in the list. Of course, you pay more while reading, since access is not sequential. Here is some example code illustrating the idea:

struct buffer_list
{
    void append_next_chunk(size_t size, char * buff)
    {
        if(buffer == nullptr) {
            buffer = buff;
            local_size = size;
            return;
        }
        if(next == nullptr) next = new buffer_list();
        next->append_next_chunk(size, buff);
    }


    char read(int offset)
    {
        if(offset >= local_size) return next->read(offset-local_size);
        return buffer[offset];
    }
    buffer_list * next = nullptr;
    char *buffer = nullptr;
    size_t local_size = 0;
    ~buffer_list()
    {
        delete[] buffer;
        delete next;
    }
};


struct custom_vector
{
    custom_vector(const size_t size) {
        write_ptr = new char[size];
        inner_list.append_next_chunk(size, write_ptr);
        total_size = size;
        last_created_size = size;
    }


    void push_back(char c){
        if(written_size == total_size)
        {
            last_created_size *= 2;
            write_ptr = new char[last_created_size];
            write_offset = total_size;
            inner_list.append_next_chunk(last_created_size, write_ptr);
            total_size += last_created_size;
        }
        write_ptr[written_size - write_offset] = c;
        written_size++;
    }

    char read(int offset)
    {
        return inner_list.read(offset);
    }

    size_t size() { return written_size; }

    char * write_ptr = nullptr;
    buffer_list inner_list;
    size_t written_size = 0;
    size_t total_size = 0;
    size_t write_offset = 0;
    size_t last_created_size = 0;
}; 

On my machine custom_vector performs much better than std::vector on write operations, while a big penalty is paid when reading. However, I think some optimizations for sequential reads could easily be implemented to address that.
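The idea can be condensed into a self-contained sketch (the names below are mine, not the answer's): chunks are linked and each is twice the size of its predecessor, so growing never copies elements that were already written.

```cpp
#include <cstddef>

// One node of the chunk list; owns its buffer and the rest of the list.
struct chunk {
    char *buf;
    size_t size;
    chunk *next = nullptr;
    explicit chunk(size_t s) : buf(new char[s]), size(s) {}
    ~chunk() { delete[] buf; delete next; }
};

// Growable byte sequence built from linked chunks that double in size.
struct chunked_vector {
    chunk *head;
    chunk *tail;     // chunk currently being written
    size_t used = 0; // elements written into tail
    size_t count = 0;

    explicit chunked_vector(size_t first) : head(new chunk(first)), tail(head) {}
    ~chunked_vector() { delete head; }

    void push_back(char c) {
        if (used == tail->size) { // tail full: link a chunk twice as big
            tail->next = new chunk(tail->size * 2);
            tail = tail->next;
            used = 0;
        }
        tail->buf[used++] = c;
        ++count;
    }

    char read(size_t i) const { // walk the chunks to find element i
        const chunk *c = head;
        while (i >= c->size) { i -= c->size; c = c->next; }
        return c->buf[i];
    }
};
```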

Upvotes: 0

nishantsingh

Reputation: 4662

arr = new int[1]; // assumes int *arr is in scope; needs <algorithm> for std::copy_n
int capacity = 1, size = 0;
inFile.open("input.txt");

int element;
while (inFile >> element) { // read-then-test; !inFile.eof() would handle the last value twice
    if (size == capacity) {
        capacity *= 2; // double when full, like std::vector
        int *newbuf = new int[capacity];
        std::copy_n(arr, size, newbuf);
        delete[] arr;
        arr = newbuf;
    }
    arr[size] = element;
    size++;
}

inFile.close();

You can simulate what a std::vector does. Double the capacity every time it gets full.
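One follow-up detail (my sketch, not part of the answer): after the loop, capacity can be up to twice size. If that slack matters, a single exact-size copy trims it, mirroring vector's shrink_to_fit.

```cpp
#include <algorithm>

// Trim a doubled buffer down to exactly `size` elements with one final
// copy (arr, size, and capacity play the same roles as in the loop above).
void shrinkToFit(int *&arr, int size, int &capacity) {
    if (size == capacity || size == 0) return;
    int *exact = new int[size];
    std::copy_n(arr, size, exact);
    delete[] arr;
    arr = exact;
    capacity = size;
}
```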

Upvotes: 0

You will need to allocate space for a bigger array and move the old values over:

// Assumes arr and size are members of the surrounding class; needs <cstring> for memcpy.
void resize() {
    size_t newSize = size * 2;
    int* newArr = new int[newSize];
    memcpy(newArr, arr, size * sizeof(int)); // fine here: int is trivially copyable
    delete [] arr;
    arr = newArr;
    size = newSize;
}

Upvotes: 0
