iammilind

Reputation: 69978

Optimizing unnecessary string copying in vector<string>

Presenting the minimal code to describe the problem:

struct A {
  vector<string> v;
  // ... other data and methods
};
A obj;
ifstream file("some_file.txt");
char buffer[BIG_SIZE];
while( <big loop> ) {
  file.getline(buffer, BIG_SIZE-1);
  // process buffer; which may change its size
  obj.v.push_back(buffer);  // <------- can be optimized ??
}
...

Here the string is created twice: first when a temporary string object is constructed from buffer, and again when that temporary is copy-constructed into the vector. Demo

The push_back() operation happens millions of times, and each time I pay for one extra allocation that is of no use to me.

Is there a way to optimize this? I am open to any suitable change. (I am not treating this as premature optimization, because push_back() is called so many times throughout the code.)

Upvotes: 2

Views: 205

Answers (3)

KQ.

Reputation: 922

You can try a couple of things. The first is obviously to enable optimization in the compiler. Declaring it as a vector<const string> may also help, if your standard library accepts a const element type (the standard does not require it to).

Otherwise you might try something like:

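obj.v.resize(obj.v.size()+1);
// Call swap() on the temporary: a temporary cannot bind to
// swap()'s non-const reference parameter, so the reverse order does not compile.
string(buffer).swap(obj.v.back());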
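If C++11 is available, emplace_back() is another option worth trying: it forwards its argument to the element's constructor, so the element is built directly inside the vector. The sketch below is only an illustration. Tracked, Counts, snapshot_and_reset, via_push_back and via_emplace_back are hypothetical names, not part of the question's code, and Tracked stands in for std::string so that constructions can be counted.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for std::string that counts how it is constructed.
struct Tracked {
    static int from_chars;  // constructions from a const char*
    static int copies;      // copy constructions
    static int moves;       // move constructions
    std::string s;

    Tracked(const char* p) : s(p) { ++from_chars; }
    Tracked(const Tracked& o) : s(o.s) { ++copies; }
    Tracked(Tracked&& o) noexcept : s(std::move(o.s)) { ++moves; }
};
int Tracked::from_chars = 0;
int Tracked::copies = 0;
int Tracked::moves = 0;

struct Counts { int from_chars; int copies; int moves; };

// Returns the current counters and sets them back to zero.
static Counts snapshot_and_reset() {
    Counts c = {Tracked::from_chars, Tracked::copies, Tracked::moves};
    Tracked::from_chars = Tracked::copies = Tracked::moves = 0;
    return c;
}

// push_back(buffer): a temporary Tracked is built from buffer,
// then moved (copied before C++11) into the vector's storage.
Counts via_push_back(const char* buffer) {
    snapshot_and_reset();
    std::vector<Tracked> v;
    v.reserve(1);  // rule out reallocation so only the insertion is counted
    v.push_back(buffer);
    assert(v.size() == 1);
    return snapshot_and_reset();
}

// emplace_back(buffer): the element is constructed in place from buffer.
Counts via_emplace_back(const char* buffer) {
    snapshot_and_reset();
    std::vector<Tracked> v;
    v.reserve(1);
    v.emplace_back(buffer);
    assert(v.size() == 1);
    return snapshot_and_reset();
}
```

Calling via_push_back("some line") should report one construction from characters plus one move, while via_emplace_back("some line") should report only the construction from characters. Whether that difference is measurable for real strings depends on the library and on line lengths, so it is worth profiling.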

Upvotes: 3

Rom

Reputation: 4199

Well, you do get two allocations, but not both of them are for the string: one creates the string, while the other creates just a pointer inside the vector. (This depends on the compiler: some compilers or settings might indeed create two strings, but most won't.) Look at this code for the demo.

One way to optimize it would be to use char* instead of string as the template parameter (don't forget to delete each pointer manually before destroying the vector!). This way you'll get rid of one of the allocations (the biggest one). Alternatively, just use your own implementation of vector: you'll then be able to control every aspect of memory allocation.
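A minimal sketch of the char* variant, assuming each line is NUL-terminated and that the vector is the sole owner of the copies. The helpers duplicate_line and free_lines are hypothetical names introduced for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies a NUL-terminated buffer into one heap block.
char* duplicate_line(const char* buffer) {
    assert(buffer != nullptr);
    const std::size_t n = std::strlen(buffer) + 1;  // +1 for the terminator
    char* copy = new char[n];
    std::memcpy(copy, buffer, n);
    return copy;
}

// Hypothetical helper: releases every stored line.
// Must run before the vector is destroyed, or the blocks leak.
void free_lines(std::vector<char*>& lines) {
    for (std::size_t i = 0; i < lines.size(); ++i) {
        delete[] lines[i];
    }
    lines.clear();
}
```

Inside the loop this would be used as obj.v.push_back(duplicate_line(buffer)), with v declared as vector<char*>, and free_lines(obj.v) called once at the end. Note that the sketch is not exception-safe: if push_back throws, the fresh copy leaks.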

Upvotes: 3

Ed Heal

Reputation: 59987

Instead of keeping buffer on the stack, put it on the heap. Then use a vector of pointers. Only one
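A related idea that avoids raw pointers, assuming C++11: read each line straight into a std::string, whose character buffer already lives on the heap (unless the line is short enough for a small-string optimization), and move it into the vector so the buffer is handed over rather than copied. This swaps in std::getline and std::move for the heap buffer and pointer vector described above. The function names read_lines and read_lines_from_text are hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Reads all lines from `in`, moving each one into the result vector.
std::vector<std::string> read_lines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        // process `line` here; it may grow or shrink freely
        lines.push_back(std::move(line));  // transfers the buffer, no copy
        line.clear();  // a moved-from string is valid but unspecified; reset it
    }
    return lines;
}

// Convenience wrapper for trying the function on in-memory text.
std::vector<std::string> read_lines_from_text(const std::string& text) {
    std::istringstream in(text);
    return read_lines(in);
}
```

With the question's code, the loop body would become obj.v.push_back(std::move(line)) with file passed to std::getline. This also removes the fixed BIG_SIZE limit, since the string grows as needed.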

Upvotes: 0
