Reputation: 453
I'm interested in the technique used by Sean Barrett to make a dynamic array in C for any type. Comments in the current version claims the code is safe to use with strict-aliasing optimizations: https://github.com/nothings/stb/blob/master/stb_ds.h#L332
You use it like:
int *array = NULL;
arrput(array, 2);
arrput(array, 3);
The allocation it does holds both the array data + a header struct:
typedef struct
{
size_t length;
size_t capacity;
void * hash_table;
ptrdiff_t temp;
} stbds_array_header;
The macros/functions all take a void* to the array and access the header by casting the void* array and moving back one:
#define stbds_header(t) ((stbds_array_header *) (t) - 1)
I'm sure Sean Barrett is far more knowledgeable than the average programmer. I'm just having trouble following how this type of code is not undefined behavior because of the strict aliasing rules in modern C. If this does avoid problems I'd love to understand why it does so I can incorporate it myself (maybe with a few less macros).
Upvotes: 3
Views: 498
Reputation: 140880
Lets follow the expansions of arrput
in https://github.com/nothings/stb/blob/master/stb_ds.h :
#define STBDS_REALLOC(c,p,s) realloc(p,s)
#define arrput stbds_arrput
#define stbds_header(t) ((stbds_array_header *) (t) - 1)
#define stbds_arrput(a,v) (stbds_arrmaybegrow(a,1), (a)[stbds_header(a)->length++] = (v))
#define stbds_arrmaybegrow(a,n) ((!(a) || stbds_header(a)->length + (n) > stbds_header(a)->capacity) \
? (stbds_arrgrow(a,n,0),0) : 0)
#define stbds_arrgrow(a,b,c) ((a) = stbds_arrgrowf_wrapper((a), sizeof *(a), (b), (c)))
#define stbds_arrgrowf_wrapper stbds_arrgrowf
void *stbds_arrgrowf(void *a, size_t elemsize, size_t addlen, size_t min_cap)
{
...
b = STBDS_REALLOC(NULL, (a) ? stbds_header(a) : 0, elemsize * min_cap + sizeof(stbds_array_header));
//if (num_prev < 65536) prev_allocs[num_prev++] = (int *) (char *) b;
b = (char *) b + sizeof(stbds_array_header);
if (a == NULL) {
stbds_header(b)->length = 0;
stbds_header(b)->hash_table = 0;
stbds_header(b)->temp = 0;
} else {
STBDS_STATS(++stbds_array_grow);
}
stbds_header(b)->capacity = min_cap;
return b;
}
how this type of code is not undefined behavior because of the strict aliasing
Strict aliasing is about accessing data that has different effective type than data stored there. I would argue that the data stored in the memory region pointed to by stbds_header(array)
has the effective type of the stbds_array_header
structure, so accessing it is fine. The structure members are allocated by realloc
and initialized one by one inside stbds_arrgrowf
by stbds_header(b)->length = 0;
lines.
how this type of code is not undefined behavior
I think the pointer arithmetic is fine. You can say that the result of realloc
points to an array of one stbds_array_header
structure. In other words, when doing the first stbds_header(b)->length =
inside stbds_arrgrowf
function the memory returned by realloc
"becomes" an array of one element of stbds_array_header
structures, as If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access
from https://port70.net/~nsz/c/c11/n1570.html#6.5p6 .
int *array
is assigned inside stbds_arrgrow
to point to "one past the last element of an array" of one stbds_array_header
structure. (Well, this is also the same place where an int
array starts). ((stbds_array_header *) (array) - 1)
calculates the address of the last array element by subtracting one from "one past the last element of an array". I would rewrite it as (char *)(void *)t - sizeof(stbds_array_header)
anyway, as (stbds_array_header *) (array)
sounds like it would generate a compiler warning.
Assigning to int *array
in expansion of stbds_arrgrow
a pointer to (char *)result_of_realloc + sizeof(stbds_array_header)
may very theoretically potentially be not properly aligned to int
array type, breaking If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined
from https://port70.net/~nsz/c/c11/n1570.html#6.3.2.3p7 . This is very theoretical, as stbds_array_header
structure has size_t
and void *
and ptrdiff_t
members, in any normal architecture it will have good alignment to access int
(or any other normal type) after it.
I have only inspected the code in expansions of arrput
. This is a 2000 lines of code, there may be other undefined behavior anywhere.
Upvotes: 2