hteo
hteo

Reputation: 335

aligned malloc c++ implementation

I found this piece of code:

void* aligned_malloc(size_t required_bytes, size_t alignment) {
int offset = alignment - 1;
void* P = (void * ) malloc(required_bytes + offset);
void* q = (void * ) (((size_t)(p) + offset) & ~(alignment - 1));
return q;
}

that is the implementation of aligned malloc in C++. Aligned malloc is a function that supports allocating memory such that the memory address returned is divisible by a specific power of two. Example:

align_malloc (1000, 128) will return a memory address that is a multiple of 128 and that points to memory of size 1000 bytes.

But I don't understand line 4. Why sum twice the offset?

Thanks

Upvotes: 3

Views: 3473

Answers (1)

Geezer
Geezer

Reputation: 5710

Why sum twice the offset?

offset isn't exactly being summed twice. First use of offset is for the size to allocate:

void* p = (void * ) malloc(required_bytes + offset);

Second time is for the alignment:

void* q = (void * ) (((size_t)(p) + offset) & ~(alignment - 1));

Explanation: ~(alignment - 1) is a negation of offset (remember, int offset = alignment - 1;) which gives you the mask you need to satisfy the alignment requested. Arithmetic-wise, adding the offset and doing bitwise and (&) with its negation gives you the address of the aligned pointer.

How does this arithmetic work? First, remember that the internal call to malloc() is for required_bytes + offset bytes. As in, not the alignment you asked for. For example, you wanted to allocate 10 bytes with alignment of 16 (so the desired behavior is to allocate the 10 bytes starting in an address that is divisible with 16). So this malloc() from above will give you 10+16-1=25 bytes. Not necessarily starting at the right address in terms of being divisible with 16). But then this 16-1 is 0x000F and its negation (~) is 0xFFF0. And now we apply the bitwise and like this: p + 15 & 0xFFF0 which will cause every pointer p to be a multiple of 16.

But wait, why add this offset of alignment - 1 in the first place? You do it because once you get the address p returned by malloc(), the one thing you cannot do (in order to find the nearest aligned address) is look for it before p -- as this will cross outside the memory range allocated by the malloc() call, beginning at p. For this, you begin by adding alignment - 1, which, think about it, is exactly the maximum by which you'd have to advance to get your alignment.

* Thanks to user DevSolar for some additional phrasing.

Note 1: For this way to work the alignment must be a power of 2. This snippet does not enforce such a thing and so could cause unexpected behavior.

Note 2: An interesting question is how could you implement a free() version for such an allocation, with the return value from this function.

Upvotes: 7

Related Questions