Reputation: 13
I use MS Visual Studio 2010 Express and have the following code:
set<string> res;
for(uint x = 0; x < 100000; ++x) res.insert( vtos(x,0) );
uint size1 = 0; uint size2 = 0; uint size3 = 0;
set<string>::const_iterator it = res.begin();
for(; it != res.end(); ++it){
    string str = *it;
    size1 += str.size();
    size2 += str.capacity();
    size3 += str.capacity() + sizeof(string);
}
cout << "size1 == " << ((float)size1)/1024/1024 << endl;
cout << "size2 == " << ((float)size2)/1024/1024 << endl;
cout << "size3 == " << ((float)size3)/1024/1024 << endl;
while(true){}
The output is:
size1 == 0.466242
size2 == 1.43051
size3 == 4.1008
The infinite loop at the end (a bad thing, I know) is only there so I can watch the process in Task Manager. Task Manager shows that my application uses 6.11 MB of memory.
Why 6 MB? Where do the extra ~2 MB come from?
If I replace the set with a vector (resized to 100000), the output is the same, but Task Manager shows ~3.45 MB.
Why 3 MB?
Sorry for my poor English, thank you in advance.
Upvotes: 1
Views: 527
Reputation: 2907
When you put items into a set, it is not only the items themselves that take space but also the set's internal bookkeeping. std::set is usually implemented as a red-black tree, which means there is a node for each item in the set. On MSVC, a node looks like this:
template<class _Value_type,
    class _Voidptr>
    struct _Tree_node
    {   // tree node
    _Voidptr _Left;     // left subtree, or smallest element if head
    _Voidptr _Parent;   // parent, or root of tree if head
    _Voidptr _Right;    // right subtree, or largest element if head
    char _Color;        // _Red or _Black, _Black if head
    char _Isnil;        // true only if head (also nil) node
    _Value_type _Myval; // the stored value, unused if head
private:
    _Tree_node& operator=(const _Tree_node&);
    };
As you can see, the value is only part of the node. On my PC, sizeof(string) is 28 bytes when compiled as a 32-bit executable, and the size of a tree node is 44 bytes.
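To put a rough number on that, here is a minimal sketch. MockTreeNode is a hypothetical stand-in that only copies the member order of the node quoted above (it is not the real MSVC type), and per_alloc_overhead is an assumed per-allocation bookkeeping cost that nothing in this thread measures.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the node layout quoted above. This is NOT the
// real MSVC type, only the same members in the same order.
template <class Value>
struct MockTreeNode {
    void* left;    // left subtree
    void* parent;  // parent node
    void* right;   // right subtree
    char  color;   // red or black
    char  is_nil;  // marks the head/nil node
    Value value;   // the stored element
};

// Rough estimate of the heap bytes used by a tree of `count` nodes.
// `per_alloc_overhead` is an assumed cost per heap allocation; the real
// figure depends on the allocator.
template <class Value>
std::size_t estimate_tree_bytes(std::size_t count, std::size_t per_alloc_overhead) {
    return count * (sizeof(MockTreeNode<Value>) + per_alloc_overhead);
}
```

With the 44-byte node reported above, 100000 nodes come to 4,400,000 bytes (about 4.2 MB) before any allocator overhead, which is what estimate_tree_bytes<std::string>(100000, 0) would return on such a build. Whatever remains of the Task Manager figure would have to come from allocator overhead and the process baseline, which this sketch does not measure.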
Upvotes: 0
Reputation: 8514
The set size and other memory use has been answered in the comments.
The vector uses less than the 4.1 MB you calculated because Visual Studio's std::string stores small strings in a buffer inside the string object itself. Only when a string is too large for that buffer does it allocate a dynamic buffer.
This means that str.capacity() + sizeof(string) overcounts strings that fit in the internal buffer, because that buffer is already part of sizeof(string). That applies to all of your strings, as Visual C++'s buffer happens to be 16 bytes.
Try running it with longer strings, e.g. add the constant string "12345678901234567890" to each value before putting it in the vector. Your memory use should go up by more than just the ~2 MB (20 × 100,000 bytes) of extra data, as the strings will have to start allocating dynamic buffers.
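If you want the calculation in the question to account for this, something along these lines might work. It is only a sketch: uses_inline_buffer, approx_footprint and approx_total are names made up here, the check assumes an implementation that keeps small strings inside the object (as described above), and allocator rounding and bookkeeping are ignored.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// True when s.data() points inside the string object itself, i.e. the
// characters live in the internal small-string buffer, not on the heap.
// std::less is used because it gives a total order over unrelated pointers.
inline bool uses_inline_buffer(const std::string& s) {
    const char* begin = reinterpret_cast<const char*>(&s);
    const char* end = begin + sizeof(std::string);
    std::less<const char*> before;
    return !before(s.data(), begin) && before(s.data(), end);
}

// Approximate footprint: the object itself, plus capacity() + 1 bytes
// (room for the terminator) only when a separate heap buffer is in use.
// Allocator rounding and bookkeeping are deliberately ignored.
inline std::size_t approx_footprint(const std::string& s) {
    std::size_t bytes = sizeof(std::string);
    if (!uses_inline_buffer(s)) bytes += s.capacity() + 1;
    return bytes;
}

// Sum of the approximate footprints of all strings in a vector.
inline std::size_t approx_total(const std::vector<std::string>& v) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += approx_footprint(v[i]);
    return total;
}
```

Calling approx_total on the vector before and after adding the 20-character suffix should show the jump described above: short strings contribute only sizeof(string) each, while the longer ones add a heap buffer on top of it.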
Upvotes: 1