Reputation: 567
I want to calculate how much memory is allocated when I create and assign values to a string.
string s = "";
cout << sizeof(s) << endl;
cout << sizeof(s.at(0)) * s.capacity() << endl;
s = "1234567890qwertz";
cout << sizeof(s) << endl;
cout << sizeof(s.at(0)) * s.capacity() << endl;
Is this all the memory that my string s consumes? That is, the initial/static part that I get by simply calling sizeof(s) (40 bytes on my machine) plus the dynamic part: the size of one character times the number of allocated slots that make the string efficiently resizable. On my machine the string s first allocated a block of 15 bytes, and once the text became too long for that, the dynamic part grew to 31 bytes after the second assignment. Why not 16 and 32 bytes, by the way?
Is this way of thinking about it (static + dynamic for each string is all the memory it occupies) correct?
Does that mean that if I have a std::vector of strings and want to calculate all the memory for that vector as well, I need to do the same thing: take the initial/static size of the vector and add, for each string inside it, the total memory occupied by that string as computed above?
vector<string> cache;
// fill cache with strings of dynamic length
int size = sizeof(cache);
for (int i = 0; i < cache.size(); i++)
{
size += sizeof(cache[i]);
size += sizeof(cache[i].at(0)) * cache[i].capacity();
}
So to sum it all up, is that the correct amount of memory occupied by my "cache"?
Edit: Or do I also need to take into account that a std::vector itself also has a .capacity() >= .size() which could mean that I would actually need to do this:
for each cache.capacity()
- I would need to add sizeof(cache[i])
and additionally
for each cache.size()
- add sizeof(cache[i].at(0)) * cache[i].capacity()
??
Upvotes: 2
Views: 2478
Reputation: 45654
If you want to know how much space your std::vector<std::string> uses, calculate it:
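#include <functional>
#include <iterator>
#include <numeric>
#include <string>
#include <vector>

auto netto_memory_use(std::vector<std::string> const& x) noexcept {
    return std::accumulate(
        begin(x),
        end(x),
        sizeof x + sizeof x[0] * x.capacity(),
        [](auto n, auto&& s) {
            // Count the buffer only if it lies outside the string object (no SSO).
            if (std::less<void const*>()(data(s), &s)
                || std::greater_equal<void const*>()(data(s) + s.capacity(), &s + 1))
                return n + s.capacity() + 1;
            return n;
        });
}
I used std::less<void const*> / std::greater_equal<void const*> to take advantage of them defining a total order over pointers, in contrast to the plain comparison operators, which are not required to give a meaningful result for pointers into different objects.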
The accumulator tests for applied small-string optimisation (SSO) before adding the string's capacity. Of course, all 0-capacity strings could share the same statically allocated terminator. Or capacity and/or length could be allocated together with the character data.
Still, that should be a good approximation for the memory used, aside from memory-management-system overhead.
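The SSO test used by the accumulator can also be tried on a single string. The following is only a minimal sketch; the helper name stored_inline is made up for illustration, and whether a short string is actually stored inline depends on the standard library implementation.

```cpp
#include <functional>
#include <string>

// Hypothetical helper: returns true when the character buffer of s lies
// inside the bytes of the string object itself, i.e. SSO is in effect.
bool stored_inline(const std::string& s) {
    const void* first = &s;       // first byte of the object
    const void* last = &s + 1;    // one past the last byte of the object
    const void* p = s.data();     // start of the character buffer
    std::less<const void*> lt;    // total order, safe for unrelated pointers
    return !lt(p, first) && lt(p, last);
}
```

A string much longer than sizeof(std::string) cannot fit inside the object, so stored_inline should report false for it, and its buffer would be counted separately.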
Upvotes: 1
Reputation: 180595
This question is going to be hard to answer. Naively you would think the total amount of memory consumed would be
vector_capacity * sizeof(std::string) + sum_of_capacity_of_each_string_in_the_vector
But this is more of an upper limit than what is actually consumed. For instance, short string optimization (SSO) allows std::string to store the string data in the storage the string object itself occupies (what you call the static size). If that is the case, then the actual space consumed would be
vector_capacity * sizeof(std::string)
and the capacity of each string in the vector would just tell you how much you can store without allocating any extra space. You will need to check your implementation to see whether it uses SSO and how long a string it will store in the string object, so that you know whether the capacity value refers to the string's internal space or to additional memory. That makes the actual space consumed
vector_capacity * sizeof(std::string) + sum_of_capacity_of_each_string_in_the_vector_where_the_capacity_is_more_than_the_sso_amount
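As a rough sketch of that last formula: the function name estimate_bytes is hypothetical, and the code assumes that a default-constructed std::string reports the SSO capacity, which holds for the common implementations but is not promised by the standard.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical estimate following the formula above.
// Assumption: std::string{}.capacity() equals the SSO amount.
std::size_t estimate_bytes(const std::vector<std::string>& v) {
    const std::size_t sso_capacity = std::string{}.capacity();
    // Every slot the vector has allocated holds one string object.
    std::size_t total = v.capacity() * sizeof(std::string);
    for (const std::string& s : v) {
        // Only strings that outgrew the in-object buffer use extra memory.
        if (s.capacity() > sso_capacity) {
            total += s.capacity();
        }
    }
    return total;
}
```

This ignores sizeof of the vector object itself and any allocator overhead, so it is an approximation rather than an exact figure.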
In your calculation, sizeof(cache[i].at(0)) is not needed. std::string uses char, and sizeof(char) is guaranteed to be 1.
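To check that claim at compile time, something like the following could be used. It is a small sketch; capacity_in_bytes is a made-up helper name.

```cpp
#include <cstddef>
#include <string>

// Both hold on every conforming implementation.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(std::string::value_type) == 1, "std::string stores char");

// Hypothetical helper: for std::string the capacity already is a byte count,
// so multiplying by sizeof(s.at(0)) changes nothing.
std::size_t capacity_in_bytes(const std::string& s) {
    return s.capacity();
}
```

For other character types, such as std::wstring or std::u16string, the multiplication by sizeof(value_type) would still be needed.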
Upvotes: 1
Reputation: 12332
There is a simple reason why the capacity of the string is one less than you expect, and that is s.c_str().
A C++ string is stored in a block of memory, with capacity giving the total size and size giving the used space. But a C string is 0-terminated. The C++ string reserves one extra byte at the end of the block of memory to store a 0. That way s.c_str() is always 0-terminated.
So the memory used by the dynamic part of the string is capacity + 1.
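A short sketch of that relationship, with hypothetical helper names. With the capacities of 15 and 31 reported in the question, it would give buffers of 16 and 32 bytes, although the exact capacities are implementation-specific.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: bytes in the character buffer, counting the slot
// reserved for the terminating 0.
std::size_t buffer_bytes(const std::string& s) {
    return s.capacity() + 1;
}

// Hypothetical helper: c_str() always has a 0 at index size().
bool is_terminated(const std::string& s) {
    return s.c_str()[s.size()] == '\0';
}
```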
As to the total memory consumed by a string or a vector of strings, I think NathanOliver answered that. But beware of vectors holding the same string multiple times.
Upvotes: 1