AndreyS

Reputation: 13

std::set<string> and memory

I use MS Visual Studio 2010 Express and have the following code:

set<string> res;
for (uint x = 0; x < 100000; ++x) res.insert(vtos(x, 0));

uint size1 = 0;  // total characters stored
uint size2 = 0;  // total reported capacity
uint size3 = 0;  // capacity plus sizeof(string) per element
set<string>::const_iterator it = res.begin();
for (; it != res.end(); ++it) {
    string str = *it;
    size1 += str.size();
    size2 += str.capacity();
    size3 += str.capacity() + sizeof(string);
}
cout << "size1 == " << ((float)size1) / 1024 / 1024 << endl;
cout << "size2 == " << ((float)size2) / 1024 / 1024 << endl;
cout << "size3 == " << ((float)size3) / 1024 / 1024 << endl;
while (true) {}  // keep the process alive so it can be inspected

The output is:

size1 == 0.466242
size2 == 1.43051
size3 == 4.1008

The infinite loop at the end (a bad thing, I know) is there only so that I can watch the process in Task Manager. Task Manager shows that my application uses 6.11 MB of memory.

Why 6 MB? Where do the extra ~2 MB go?

If I replace the set with a vector (resized to 100000), the output is the same, but Task Manager shows ~3.45 MB.

Why only about 3 MB?

Sorry for my poor English, and thank you in advance.

Upvotes: 1

Views: 527

Answers (2)

Sunius

Reputation: 2907

When you put items into a set, the items themselves are not the only thing that takes space: the set's internal bookkeeping does too. std::set is usually implemented as a red-black tree, which means there is a node for each item in the set. In MSVC, a node looks like this:

template<class _Value_type,
class _Voidptr>
struct _Tree_node
{   // tree node
    _Voidptr _Left; // left subtree, or smallest element if head
    _Voidptr _Parent;   // parent, or root of tree if head
    _Voidptr _Right;    // right subtree, or largest element if head
    char _Color;    // _Red or _Black, _Black if head
    char _Isnil;    // true only if head (also nil) node
    _Value_type _Myval; // the stored value, unused if head

private:
    _Tree_node& operator=(const _Tree_node&);
};

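As you can see, the value is only part of the node. On my PC, sizeof(string) is 28 bytes when compiled as a 32-bit executable, and the size of a tree node is 44 bytes: three 4-byte pointers, two chars plus 2 bytes of padding, and the 28-byte string. That is 16 bytes of bookkeeping per element, or about 1.5 MB for 100,000 elements, before any per-allocation heap overhead.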
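The 44-byte figure can be reproduced with a small model of the node. This is only a sketch: ModelString32 and ModelTreeNode32 are made-up stand-ins (not MSVC types) that assume 4-byte pointers and the 28-byte string reported above, so that the numbers come out the same on any compiler.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the 28-byte std::string of the 32-bit build.
struct ModelString32 {
    std::uint32_t words[7];  // 7 * 4 = 28 bytes, 4-byte aligned
};

// Same member order as _Tree_node, with 4-byte integers in place of pointers.
struct ModelTreeNode32 {
    std::uint32_t left;
    std::uint32_t parent;
    std::uint32_t right;
    char color;
    char is_nil;
    ModelString32 value;  // preceded by 2 bytes of alignment padding
};

static_assert(sizeof(ModelString32) == 28, "model string should be 28 bytes");
static_assert(sizeof(ModelTreeNode32) == 44, "model node should be 44 bytes");

// Bytes occupied by `count` nodes, ignoring per-allocation heap overhead.
constexpr std::size_t node_bytes(std::size_t count) {
    return count * sizeof(ModelTreeNode32);
}

// The part of those bytes that is tree bookkeeping rather than string object.
constexpr std::size_t bookkeeping_bytes(std::size_t count) {
    return count * (sizeof(ModelTreeNode32) - sizeof(ModelString32));
}
```

For the 100,000 elements in the question, node_bytes gives 4,400,000 bytes (about 4.2 MB), of which 1,600,000 bytes are bookkeeping. Whatever the heap allocator adds per node comes on top of that and is not modeled here.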

Upvotes: 0

The Dark

Reputation: 8514

The set size and other memory use have been answered in the comments.

The vector uses less than the 4.1 MB you calculated because Visual Studio's std::string stores small strings in a buffer that is internal to the string object. Only when a string is larger than that buffer does it allocate a dynamic buffer to hold the characters. This means that str.capacity() + sizeof(string) overcounts for strings that fit in the internal buffer, because their capacity is already part of sizeof(string). That is the case for all of your strings, as Visual C++'s buffer happens to be 16 bytes.
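A rough way to count each string only once is to check whether its character data sits inside the string object. The sketch below does that by comparing addresses; stored_inline and estimated_footprint are hypothetical helper names, and the check relies on how common implementations lay out the small buffer, not on anything the standard guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// True when the character data of `s` lives inside the string object itself
// (the small-string buffer) rather than in a separate heap block.
inline bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    const char* data = s.data();
    // std::less gives a usable ordering even for unrelated pointers.
    std::less<const char*> before;
    return !before(data, obj_begin) && before(data, obj_end);
}

// Rough per-string footprint: the object itself, plus a heap buffer only when
// the data is not stored inline. Allocator overhead is ignored.
inline std::size_t estimated_footprint(const std::string& s) {
    assert(s.capacity() >= s.size());  // sanity check on the library invariant
    std::size_t bytes = sizeof(std::string);
    if (!stored_inline(s)) {
        bytes += s.capacity() + 1;  // +1 for the terminating '\0'
    }
    return bytes;
}
```

Summing estimated_footprint over the strings in the question would give sizeof(string) per element rather than sizeof(string) plus capacity, since every value there is at most five characters long.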

Try running it with longer values in the strings. For example, append the constant string "12345678901234567890" to each value before putting it in the vector. Your memory use should go up by more than just the roughly 2 MB (20 * 100,000) of extra data, because the strings will have to start allocating dynamic buffers.
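That experiment could be sketched as follows. Two assumptions: std::to_string stands in for the question's vtos helper, and a string is treated as heap-backed when its capacity exceeds that of an empty string, which is a heuristic that holds for the common small-buffer implementations rather than a guarantee. make_values and count_heap_backed are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds the decimal strings "0" .. "count-1", each followed by `suffix`.
// std::to_string is a stand-in for the question's vtos helper.
inline std::vector<std::string> make_values(unsigned count,
                                            const std::string& suffix) {
    std::vector<std::string> values;
    values.reserve(count);
    for (unsigned x = 0; x < count; ++x) {
        values.push_back(std::to_string(x) + suffix);
    }
    assert(values.size() == count);
    return values;
}

// Counts strings that have outgrown the internal buffer. Heuristic: an empty
// string reports the capacity of the internal buffer, so anything with a
// larger capacity must own a dynamic buffer.
inline std::size_t count_heap_backed(const std::vector<std::string>& values) {
    const std::size_t inline_capacity = std::string().capacity();
    std::size_t heap_backed = 0;
    for (const std::string& s : values) {
        if (s.capacity() > inline_capacity) {
            ++heap_backed;
        }
    }
    return heap_backed;
}
```

With an empty suffix, count_heap_backed(make_values(100000, "")) should report no heap-backed strings. With the 20-character suffix, every string exceeds Visual C++'s 16-byte buffer, so each one should own a dynamic buffer in addition to the string object.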

Upvotes: 1
