adrien.pain

Reputation: 453

Specific behaviour of std::string on visual studio?

I've got a project in which I need to read/write large files.

I've decided to use ifstream::read() to load each file into memory in a single pass, into an std::string. That seems to be the fastest way to do it in C++; see http://insanecoding.blogspot.com/2011/11/how-to-read-in-file-in-c.html and http://insanecoding.blogspot.com/2011/11/reading-in-entire-file-at-once-in-c.html.
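For reference, a minimal sketch of that single-pass read, assuming the file fits in memory and that seeking to the end yields its size (true for regular files opened in binary mode). The helper name `read_whole_file` is hypothetical and not taken from the linked posts.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads the whole file at `path` into `buffer`,
// replacing whatever the buffer held before.
void read_whole_file(const std::string& path, std::string& buffer) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }

    // Find the file size by seeking to the end.
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size < 0) {
        throw std::runtime_error("cannot determine size of " + path);
    }
    in.seekg(0, std::ios::beg);

    // Size the string once, then read straight into its storage.
    buffer.resize(static_cast<std::size_t>(size));
    if (size > 0) {
        in.read(&buffer[0], static_cast<std::streamsize>(size));
    }
    if (!in) {
        throw std::runtime_error("short read on " + path);
    }
}
```

Calling `read_whole_file(path, buffer)` with the same `buffer` for each file reuses its existing capacity whenever the next file is no larger than the previous one.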

When switching between files, I then need to "reset" the std::string used as the previous memory buffer (i.e., release its underlying char[] buffer to free the memory).

I tried :

std::string::clear()
std::string::assign("")
std::string::erase(0, std::string::npos)
std::string::resize(0)
std::string::reserve(0)

but under Visual Studio 2008, none of these frees the memory held by the std::string: its underlying buffer isn't deallocated.

The only way I found to release it is to call std::string::swap(std::string("")), which exchanges the internal buffers of the actual std::string and the empty temporary passed as the parameter.

I find this behaviour a bit strange...

I only tested on Visual Studio 2008, so I don't know whether this is standard STL behaviour or MSVC-specific.

Could you give me a clue?

Upvotes: 6

Views: 1914

Answers (1)

Jeffrey Yasskin

Reputation: 5692

As Vlad and Alf commented, std::string().swap(the_string) is the C++98 way to release the_string's capacity, and the_string.shrink_to_fit() is the C++11 way (formally a non-binding request, so the library is allowed to ignore it).
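A minimal sketch of both idioms follows. The helper names `release_buffer` and `release_buffer_cpp11` are hypothetical, introduced only for illustration. Note that the temporary sits on the left of swap(): swap() takes a non-const reference, so passing a temporary as the argument, as in the question, is accepted by Visual C++ as an extension but is not standard C++.

```cpp
#include <string>

// C++98 idiom: swap with an empty temporary. The temporary takes over
// the large buffer and frees it when it is destroyed at the end of the
// full expression, leaving `s` empty with a minimal capacity.
void release_buffer(std::string& s) {
    std::string().swap(s);
}

// C++11 alternative: empty the string, then ask the library to drop
// unused capacity. shrink_to_fit() is a non-binding request, so the
// capacity afterwards is up to the implementation.
void release_buffer_cpp11(std::string& s) {
    s.clear();
    s.shrink_to_fit();
}
```

After `release_buffer(buffer)`, `buffer.capacity()` is typically just the small-string capacity of the library in use.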

As to why clear(), erase(), resize(), etc. don't do it: this is an optimization to reduce allocations when you use a string over and over. If clear() freed the string's capacity, you'd generally have to reallocate a similar amount of space on the next iteration, which would take time that the implementation can save by keeping the capacity around. The standard doesn't guarantee this behavior, but it's very common across implementations.
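One way to observe this is to compare capacity() before and after clear(). The sketch below uses a hypothetical helper, `capacity_after_clear`; the value it returns is implementation-defined, although the common standard libraries keep the full buffer.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds a string of `n` characters, clears it,
// and reports how much capacity the library kept around.
std::size_t capacity_after_clear(std::size_t n) {
    std::string s(n, 'x');
    s.clear();            // size() becomes 0
    return s.capacity();  // usually still at least n
}
```

On an implementation that retains the buffer, `capacity_after_clear(1u << 20)` returns at least 1048576, which is exactly the memory the question is trying to get back.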

reserve() is documented with

Calling reserve() with a res_arg argument less than capacity() is in effect a non-binding shrink request. A call with res_arg <= size() is in effect a non-binding shrink-to-fit request.

which implies that implementations are more likely to release the capacity on a reserve() call. If I'm reading them right, libc++ and libstdc++ do release space when you call reserve(0), but it's plausible for VC++'s library to have made the opposite choice.

Edit: As penelope says, std::string's behavior here tends to be exactly the same as std::vector's behavior.

Upvotes: 4
