Reputation: 106
I am a bit confused by parts of the basic_string implementation. I have been going through the source to understand its inner workings and learn new things, but I can't entirely grasp how the memory is managed.
Just some tidbits from the basic string implementation
The raw allocator is for char type
typedef typename _Alloc::template rebind<char>::other _Raw_bytes_alloc;
...then, when allocating, the _Rep is placed within the allocated buffer; __size is calculated to also fit the characters
size_type __size = (__capacity + 1) * sizeof(_CharT) + sizeof(_Rep);
void* __place = _Raw_bytes_alloc(__alloc).allocate(__size);
_Rep *__p = new (__place) _Rep;
This is how the character data is fetched from the _Rep buffer
_CharT* _M_refdata() throw()
{
    return reinterpret_cast<_CharT*>(this + 1);
}
Setting up the characters (this is one of several ways it is done)
_M_assign(__p->_M_refdata(), __n, __c);
What is bothering me is that the raw allocator is for type char, but the allocated memory may hold a _Rep object plus the character data (which does not have to be of type char).
Also, why (or rather how) does the call to _M_refdata know where the start (or end) of the character data is within the buffer (i.e. this+1)?
Edit: does this+1 just push the internal pointer to the next position after the _Rep object?
I have a basic understanding of memory alignment and casting, but this seems to go beyond anything I have read up on.
Can anybody help, or point me to more informative reading material?
Upvotes: 3
Views: 1151
Reputation: 32212
You're missing the placement new. The line
_Rep *__p = new (__place) _Rep;
initializes a new _Rep object at __place. The space for it has already been allocated beforehand (a placement new doesn't allocate anything by itself; it is really only a constructor call).
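As a minimal sketch of that point, the snippet below constructs an object inside storage the caller already owns. The type Counter and the function constructs_in_place are hypothetical names made up for illustration; they are not part of the string implementation.

```cpp
#include <cassert>
#include <new>      // placement new

// Hypothetical type that counts how many times its constructor runs.
struct Counter {
    static int constructed;
    int value;
    Counter() : value(7) { ++constructed; }
};
int Counter::constructed = 0;

// Constructs a Counter inside caller-provided storage and reports whether
// the object ended up at exactly that address with the constructor applied.
bool constructs_in_place() {
    alignas(Counter) unsigned char storage[sizeof(Counter)];
    Counter* p = new (storage) Counter;   // constructor call only, no allocation
    bool same_address = static_cast<void*>(p) == static_cast<void*>(storage);
    bool ok = same_address && p->value == 7;
    p->~Counter();                        // no delete: the storage is not heap-owned
    return ok;
}
```

The returned pointer is the same address that was passed in, which is why the string code can treat __place and __p as the same block.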
Pointer arithmetic in C and C++ tells you that this + 1 is a pointer that points sizeof(*this) bytes to the right of this. Since (__capacity + 1) * sizeof(_CharT) + sizeof(_Rep) bytes were allocated beforehand, the space after the _Rep object is used for the character data. The layout is thus like this:
| _Rep | (__capacity + 1) * _CharT |
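The same layout can be reproduced in a small, self-contained sketch. Header, make_block and free_block are hypothetical stand-ins for _Rep and the library's internal routines, and the sketch assumes that the memory returned by std::allocator<char> is suitably aligned for Header (true for the default allocator on common implementations, which forward to operator new).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>

// Hypothetical stand-in for _Rep: bookkeeping only, no character array member.
struct Header {
    std::size_t length;
    std::size_t capacity;

    // Character data begins immediately after this object, as in _M_refdata().
    char* data() { return reinterpret_cast<char*>(this + 1); }
};

// Allocates one block holding a Header followed by the text and its terminator.
Header* make_block(const char* text) {
    std::size_t n = std::strlen(text);
    std::size_t bytes = sizeof(Header) + (n + 1) * sizeof(char);
    std::allocator<char> alloc;
    void* place = alloc.allocate(bytes);   // raw bytes, no objects yet
    Header* h = new (place) Header;        // construct the header at the front
    h->length = n;
    h->capacity = n;
    std::memcpy(h->data(), text, n + 1);   // characters live right behind it
    return h;
}

// Destroys the header and returns the whole block to the allocator.
void free_block(Header* h) {
    std::size_t bytes = sizeof(Header) + (h->capacity + 1) * sizeof(char);
    h->~Header();
    std::allocator<char>().deallocate(reinterpret_cast<char*>(h), bytes);
}
```

Calling make_block("abc") yields a header whose data() pointer sits exactly sizeof(Header) bytes past the start of the block, which is all that this + 1 computes.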
Upvotes: 5
Reputation: 54325
Allocators, like C's malloc, return pointers to bytes, not objects. So, the return type is either char * or void *.
The C and C++ standards explicitly allow the bytes of any object to be accessed through a pointer to char (or unsigned char). This is because C often needs to treat objects as byte arrays (as when writing to disk or a network socket) and to treat byte arrays as objects (as when allocating a range of memory or reading from disk).
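A small sketch of the byte-wise view, assuming a hypothetical plain struct Point and helper bytes_of that exist only for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical plain struct used only for illustration.
struct Point { int x; int y; };

// Copies the object representation of p out through an unsigned char pointer.
// Reading an object's bytes this way is the access the aliasing rules permit.
std::vector<unsigned char> bytes_of(const Point& p) {
    const unsigned char* raw = reinterpret_cast<const unsigned char*>(&p);
    return std::vector<unsigned char>(raw, raw + sizeof(Point));
}
```

The resulting vector has sizeof(Point) elements and matches what std::memcpy would copy out of the same object; the actual byte values depend on the platform's int size and endianness.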
To protect against aliasing and optimization problems, you're not allowed to cast the same char * to different, unrelated object types and access the memory through both. Writing to an object's bytes directly is also restricted: it is only well defined for trivially copyable types.
Upvotes: 0