Reputation: 19
I am trying to find out whether I can allocate additional memory for additional members of a dynamic array in C++. The code below is stripped down to just the essential stuff for simplicity's sake.
What I am basically trying to do is read elements into arr in someClass from a file, but I want the file not to have to specify how many elements it contains (this is essential for the full project I am working on). I obviously thought of the solution of allocating a new array of size + 1, copying all the current elements into it, and then reading a new element, but I find that this would be a brutal waste of processing time and memory if it is possible to simply allocate the next memory address and use it for my array. Is there a way to do that in C++?
To summarize: I want to read an element, allocate a new memory address for my array, and then read another element into that newly allocated address. Repeat until the file ends.
I won't bore you with the reasons, but using a simple std::vector is not an option.
Here is the code:
class someClass {
public:
    int *arr;
    void Read(ifstream&, int&);
};

void someClass::Read(ifstream &inFile, int &size) {
    arr = new int[0];
    inFile.open("input.txt");
    int index = 0;
    int element;
    while (!inFile.eof()) {
        inFile >> element;
        *(arr + index) = element;
        index++;
    }
    size = index;
    inFile.close();
}
int main() {
    int size;
    someClass a;
    ifstream inFile;
    a.Read(inFile, size);
    //obviously unnecessary, just for testing
    for (int i = 0; i < size; i++) {
        cout << a.arr[i] << " ";
    }
    cout << endl;
}
Upvotes: 1
Views: 256
Reputation: 999
I just liked the question and did some experiments myself, using the MSVC14 compiler (optimizations disabled).
C++11/14 has the following sequence containers (I intentionally excluded std::dynarray, which was proposed for C++14 but never made it into the standard):
raw array (new int[size])
std::array<int, size>
std::vector
std::deque
std::list
std::forward_list
Let me start with your questions:
the solution of allocating a new array with a size+1, and then copying all the current elements into it and reading a new element, but I find that this would be a brutal waste of processing time and memory
You are right, but to mitigate the overhead: when the memory you allocated runs out, you allocate a new, larger block, copy the previous data into it, and then free the previously allocated block.
But wait! How much should you allocate (size + 1 is bad)? Each time you are forced to allocate a bigger chunk of memory, you should allocate at least twice the size you already have, so that you reduce the number of future reallocations; reallocation is considered an extremely expensive operation.
if it is possible to simply allocate the next memory address and use it for my array. Is there a way to do that in C++?
It's not totally in your control, as the C++ runtime implements the memory management functions. Where your newly allocated memory ends up is not up to you; however, sometimes the newly allocated block happens to have the same base address as the previous one. It depends on the runtime and the memory fragmentation it faces.
I got some benchmarks using the malloc and realloc functions borrowed from C. Here is the code:
auto start = chrono::steady_clock::now();
auto partialSize = 100;                                  // initial capacity
auto arr = (int *) malloc(sizeof(int) * partialSize);
for (auto i = 0; i < SIZE; i++) {
    arr[i] = i;
    if (i == partialSize - 1) {                          // buffer is full: grow it
        partialSize = partialSize << 1;                  // for 2X
        // NB: real code should check realloc's result before overwriting arr
        arr = (int *) realloc(arr, sizeof(int) * partialSize);
    }
}
auto duration = chrono::steady_clock::now() - start;
free(arr);
cout << "Duration: " << chrono::duration_cast<chrono::milliseconds>(duration).count() << "ms" << endl;
Results (for insertion of 100,000,000 integers; time is avg. of 3 runs):
Start Size = 100, Increment Steps = 4X, Time(s) = 0.42s
Start Size = 10,000, Increment Steps = 1.5X, Time(s) = 0.96s
Start Size = 10,000, Increment Steps = 4X, Time(s) = 0.51s
Another case is using C++'s new keyword and checking for relocation:
auto start = chrono::steady_clock::now();
auto partialSize = 100;
auto arr = new int[partialSize];
for (auto i = 0; i < SIZE; i++) {
    arr[i] = i;
    if (i == partialSize - 1) {                          // buffer is full: grow it
        auto newArr = new int[partialSize << 2];         // for 4X
        copy(arr, arr + partialSize, newArr);            // keep the data already written
        delete[] arr;                                    // free the old block
        partialSize = partialSize << 2;
        arr = newArr;
    }
}
auto duration = chrono::steady_clock::now() - start;
delete[] arr;
cout << "Duration: " << chrono::duration_cast<chrono::milliseconds>(duration).count() << "ms" << endl;
Results (for insertion of 100,000,000 integers; time is avg. of 3 runs):
Start Size = 100, Increment Steps = 4X, Time(s) = 0.36s
Start Size = 10,000, Increment Steps = 1.5X, Time(s) = 0.65s
For the rest (the dynamically resizable containers):
auto start = chrono::steady_clock::now();
//auto arr = vector<int>{};
//auto arr = list<int>{};
//auto arr = new std::array<int, SIZE>{};
//auto arr = new int[SIZE];
//auto arr = deque<int>{};
auto arr = forward_list<int>{};
for (auto i = 0; i < SIZE; i++) {
    arr.push_front(i);   // for forward_list; use push_back(i) for the others
}
auto duration = chrono::steady_clock::now() - start;
cout << "Duration: " << chrono::duration_cast<chrono::milliseconds>(duration).count() << "ms" << endl;
Results (for insertion of 100,000,000 integers; time is avg. of 3 runs):
vector
list
array (no reallocation)
raw int array (no reallocation)
deque
forward_list
Hope it helps.
Upvotes: 1
Reputation: 2802
You can read the number of elements in the file by counting the delimiters and then size the array to that. For example, let's assume your file is delimited by lines, as in:
1
2
3
4
5
You can count the number of lines in the file with the appropriate line separator. On Linux it can be done by:
int elemCount = std::count(std::istreambuf_iterator<char>(inFile),
                           std::istreambuf_iterator<char>(), '\n');
inFile.clear();
inFile.seekg(0, std::ios::beg);
Then you can allocate the array as:
arr = new int[elemCount];
If your file is space- or tab-delimited, then change '\n' to ' ' or '\t' or whatever. You can then read in your information as before. You may need to add or subtract 1 depending on your delimiter and how the file is built. This is also a little dangerous, as empty rows, double delimiters, etc. could mess up the count. If I did this, I would fill the array with some default value and then remove the defaults after reading, to be sure all my values were good. That would require one resize after all the reading is done.
Upvotes: 0
Reputation: 91
I will propose a different solution, in which you do not need to copy the previously read content of the file every time. The idea is to use a linked list in which every element is twice the size of its predecessor (another possibility is to make it grow like a Fibonacci series). This way, allocations become rarer and rarer as the size of the file grows, and, if you need to, you can free memory from the beginning of the file by freeing the first elements in the list. Of course, you pay more while reading, since access is not sequential. Here is an example illustrating the idea:
struct buffer_list
{
    // Append a pre-allocated chunk at the end of the list.
    void append_next_chunk(size_t size, char * buff)
    {
        if(buffer == nullptr) {
            buffer = buff;
            local_size = size;
            return;
        }
        if(next == nullptr) next = new buffer_list();
        next->append_next_chunk(size, buff);
    }

    // Walk the list until the chunk containing `offset` is reached.
    char read(int offset)
    {
        if(offset >= local_size) return next->read(offset - local_size);
        return buffer[offset];
    }

    buffer_list * next = nullptr;
    char *buffer = nullptr;
    size_t local_size = 0;

    ~buffer_list()
    {
        delete[] buffer;
        delete next;
    }
};

struct custom_vector
{
    custom_vector(const size_t size) {
        write_ptr = new char[size];
        inner_list.append_next_chunk(size, write_ptr);
        total_size = size;
        last_created_size = size;
    }

    void push_back(char c) {
        if(written_size == total_size)
        {
            // Full: append a new chunk twice the size of the last one.
            last_created_size *= 2;
            write_ptr = new char[last_created_size];
            write_offset = total_size;
            inner_list.append_next_chunk(last_created_size, write_ptr);
            total_size += last_created_size;
        }
        write_ptr[written_size - write_offset] = c;
        written_size++;
    }

    char read(int offset)
    {
        return inner_list.read(offset);
    }

    size_t size() { return written_size; }

    char * write_ptr = nullptr;   // start of the chunk currently being written
    buffer_list inner_list;
    size_t written_size = 0;      // elements stored so far
    size_t total_size = 0;        // total capacity across all chunks
    size_t write_offset = 0;      // global index of the current chunk's first slot
    size_t last_created_size = 0;
};
On my machine the custom_vector performs way better than std::vector on write operations, while a big penalty is paid on reads.
However, I think some optimizations for sequential reads could easily be implemented to mitigate the issue.
Upvotes: 0
Reputation: 4662
arr = new int[1];
int capacity = 1, size = 0;
inFile.open("input.txt");
int element;
while (!inFile.eof()) {
inFile >> element;
if (size == capacity){
capacity *= 2;
int * newbuf = new int[capacity];
std::copy_n(arr, size, newbuf);
delete[] arr;
arr = newbuf;
}
arr[size] = element;
size++;
}
size = index;
inFile.close();
You can simulate what a std::vector does: double the capacity every time it gets full.
Upvotes: 0
Reputation: 2442
You will need to allocate new space for the bigger array and move the old values over:
void resize() {
    size_t newSize = size * 2;                 // grow geometrically, not by +1
    int* newArr = new int[newSize];
    memcpy( newArr, arr, size * sizeof(int) ); // move the old values over
    size = newSize;                            // here `size` is the allocated capacity
    delete [] arr;
    arr = newArr;
}
Upvotes: 0