vurentjie

Reputation: 106

std::string allocation policy

I am a bit confused by some of the basic string implementation. I have been going through the source to understand its inner workings and learn new things, but I can't entirely grasp how the memory is managed.

Just some tidbits from the basic string implementation

What is bothering me is that the raw allocator is of type char, but the allocated memory may hold a _Rep object plus the character data (which does not have to be of type char).

Also, why (or rather how) does the call to _M_refdata know where the start (or end) of the character data is within the buffer (i.e. this + 1)?

Edit: does this + 1 just move the pointer to the first position after the _Rep object?

I have a basic understanding of memory alignment and casting, but this seems to go beyond anything I have read up on.

Can anybody help, or point me to more informative reading material?

Upvotes: 3

Views: 1151

Answers (2)

filmor

Reputation: 32212

You're missing the placement new. The line

_Rep *__p = new (__place) _Rep;

constructs a new _Rep object at __place. The space for it was already allocated beforehand; placement new doesn't allocate anything itself, it only runs the constructor at the given address.

Pointer arithmetic in C and C++ tells you that this + 1 is a pointer to the address sizeof(*this) bytes past this. Since (__capacity + 1) * sizeof(_CharT) + sizeof(_Rep) bytes were allocated beforehand, the space after the _Rep object is used for the character data. The layout is thus:

| _Rep |  (__capacity + 1) * _CharT  |
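To make the layout concrete, here is a minimal sketch of the same idea. It is not the libstdc++ code: Header, create and destroy are hypothetical names, the element type is fixed to char, and the buffer comes from ::operator new rather than a char-typed allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical stand-in for _Rep: bookkeeping stored in front of the characters.
struct Header {
    std::size_t length;
    std::size_t capacity;

    // this + 1 points sizeof(Header) bytes past this, i.e. at the first character.
    char* data() { return reinterpret_cast<char*>(this + 1); }
};

// Allocate one raw buffer for header + characters + terminator,
// then construct the header at its start with placement new.
Header* create(const char* text) {
    assert(text != nullptr);
    const std::size_t len = std::strlen(text);
    const std::size_t bytes = sizeof(Header) + (len + 1) * sizeof(char);

    void* place = ::operator new(bytes);   // raw memory, no objects yet
    Header* h = new (place) Header;        // constructor call only, no allocation
    h->length = len;
    h->capacity = len;
    std::memcpy(h->data(), text, len + 1); // character data lives right after the header
    return h;
}

// Destroy the header and release the whole buffer in one go.
void destroy(Header* h) {
    h->~Header();
    ::operator delete(h);
}
```

Calling create("hello") returns a pointer to the header, and data() on it yields the address sizeof(Header) bytes further on, where the characters start. Nothing beyond the header's own address has to be stored to find them.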

Upvotes: 5

Zan Lynx

Reputation: 54325

Raw allocators, like C's malloc, return pointers to uninitialized bytes, not to objects. So the return type is either char * or void *.

Both the C and C++ standards contain a clause (part of the aliasing rules) that explicitly allows any object to be accessed through a pointer to char or unsigned char. This is because C often needs to treat objects as byte arrays (as when writing to disk or a network socket) and needs to turn byte buffers back into objects (as when handing out allocated memory or reading from disk).

To protect against aliasing and optimization problems, the exemption only goes one way: you may not take the same char * buffer and access it as several unrelated object types, and in C++ an object has to be created in the buffer first (for example with placement new) before it is used through a pointer of that type.
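As a small illustration of treating an object as bytes and back, the sketch below reads an integer's bytes through an unsigned char pointer and rebuilds the value with std::memcpy. The function name round_trip is hypothetical, and copying with std::memcpy is just one well-defined way to go from bytes back to an object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy an object's bytes out through an unsigned char view, then rebuild it.
std::uint32_t round_trip(std::uint32_t value) {
    unsigned char bytes[sizeof value];

    // Reading any object through unsigned char is permitted by the aliasing rules.
    const unsigned char* view = reinterpret_cast<const unsigned char*>(&value);
    for (std::size_t i = 0; i < sizeof value; ++i) {
        bytes[i] = view[i];
    }
    // The byte array now matches the object's representation exactly.
    assert(std::memcmp(bytes, &value, sizeof value) == 0);

    // Going the other way: copy the bytes into a real object instead of
    // casting the byte array to std::uint32_t*.
    std::uint32_t restored;
    std::memcpy(&restored, bytes, sizeof restored);
    return restored;
}
```

The byte buffer here could equally be written to a file or socket between the two steps, provided the reader uses the same size and byte order.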

Upvotes: 0
