Reputation: 8199
I see this in code sometimes:
struct S
{
int count; // length of array in data
int data[1];
};
The storage for S is allocated larger than sizeof(S) so that data has room for a longer array. It is then used like this:
S *s;
// allocation
s->data[3] = 1337;
My question is: why is data not a pointer? Why the length-1 array?
Upvotes: 1
Views: 603
Reputation: 12392
Because it lets you write code like this:
struct S
{
int count; // length of array in data
int data[1];
};
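struct S *foo;
foo = malloc(sizeof(struct S) + (len - 1) * sizeof(int)); // room for len ints in total
foo->count = len;
memcpy(foo->data, buf, len * sizeof(int)); // buf holds len ints; strcpy would be wrong for an int array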
Which only requires one call to malloc and one call to free.
This is common enough that the C99 standard lets you omit the array length entirely. This is called a flexible array member.
From ISO/IEC 9899:1999, Section 6.7.2.1, paragraph 16: "As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member."
struct S
{
int count; // length of array in data
int data[];
};
GCC has also long allowed zero-length array members (int data[0];) as the last member of a struct, as an extension.
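As a minimal sketch of how the flexible array member version might be allocated and filled, assuming a C99 compiler: the helper name s_create and its parameters are made up for illustration and are not part of the original answer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct S
{
    int count;   /* number of elements in data */
    int data[];  /* C99 flexible array member; must be the last member */
};

/* Hypothetical helper: allocates the header plus len ints in ONE block
   and copies len ints from src. Returns NULL if malloc fails. */
struct S *s_create(const int *src, int len)
{
    assert(len >= 0);

    /* sizeof(struct S) does not include the flexible array itself,
       so there is no "len - 1" adjustment as with int data[1]. */
    struct S *s = malloc(sizeof(struct S) + (size_t)len * sizeof(int));
    if (s == NULL)
        return NULL;

    s->count = len;
    if (len > 0)
        memcpy(s->data, src, (size_t)len * sizeof(int));
    return s;
}
```

The caller releases everything with a single free(s), since the header and the array live in the same block.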
Upvotes: 1
Reputation: 81347
Incidentally, I don't think there's any guarantee that using a length-one array as something longer is going to work. A compiler would be free to generate effective-address code that relies upon the subscript being no larger than the specified bound (e.g. if an array bound is specified as one, a compiler could generate code that always accesses the first element, and if it's two, on some platforms, an optimizing compiler might turn a[i] into ((i & 1) ? a[1] : a[0])). Note that while I'm unaware of any compilers that actually do that transform, I am aware of platforms where it would be more efficient than computing an array subscript.
I think a standards-compliant approach would be to declare the array as [MAX_SIZE] and allocate sizeof(struct S)-(MAX_SIZE-len)*sizeof(int) bytes.
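A rough sketch of that suggestion, using the size formula from the sentence above. MAX_SIZE is given an arbitrary value here, and the names SMax and smax_create are invented for illustration. Whether accessing a deliberately under-allocated struct is strictly conforming is itself debatable, so treat this as an illustration of the idea rather than a guarantee.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_SIZE 1024  /* assumed upper bound, picked only for this example */

struct SMax
{
    int count;           /* number of elements actually allocated */
    int data[MAX_SIZE];  /* declared at full size, so no subscript exceeds the bound */
};

/* Hypothetical helper: allocates only as many bytes as len elements need,
   i.e. sizeof(struct SMax) minus the unused tail of the array. */
struct SMax *smax_create(int len)
{
    assert(len >= 0 && len <= MAX_SIZE);

    size_t bytes = sizeof(struct SMax)
                 - (size_t)(MAX_SIZE - len) * sizeof(int);
    struct SMax *s = malloc(bytes);
    if (s == NULL)
        return NULL;

    s->count = len;
    for (int i = 0; i < len; i++)
        s->data[i] = 0;  /* only touch elements that were really allocated */
    return s;
}
```

The code must never read or write data[i] for i >= count, because that memory was not allocated.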
Upvotes: 0
Reputation: 20403
Because of different copy semantics. If it is a pointer inside, then the contents have to be explicitly copied. If it is a C-style array inside, then the copy is automatic.
Upvotes: 0
Reputation: 311
It's done to make it easier to manage the fact that the array is sequential in memory (within the struct). Otherwise, after a malloc of more than sizeof(S) bytes, you would have to point 'data' at the memory immediately following the struct.
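For comparison, here is a sketch of what the pointer version would look like with a single oversized allocation. The names SPtr and sptr_create are hypothetical. The extra step is the assignment that aims data at the bytes just past the header.

```c
#include <assert.h>
#include <stdlib.h>

struct SPtr
{
    int count;
    int *data;  /* pointer member instead of an inline array */
};

/* Hypothetical helper: one malloc for header + elements, after which
   the pointer has to be set up by hand. */
struct SPtr *sptr_create(int len)
{
    assert(len >= 0);

    struct SPtr *s = malloc(sizeof(struct SPtr) + (size_t)len * sizeof(int));
    if (s == NULL)
        return NULL;

    s->count = len;
    /* s + 1 is the first address after the header; it is suitably
       aligned for int because struct SPtr itself contains an int. */
    s->data = (int *)(s + 1);
    return s;
}
```

With the array declared inside the struct, that manual fix-up (and the extra pointer stored in every object) disappears.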
Upvotes: 1
Reputation: 755567
Raymond Chen wrote an excellent article on precisely why variable length structures chose this pattern over many others (including pointers).
He doesn't directly comment on why an array was chosen over a pointer, but Steve Dispensa provides some insight in the comments section.
From Steve:
typedef struct _TOKEN_GROUPS {
DWORD GroupCount;
SID_AND_ATTRIBUTES *Groups;
} TOKEN_GROUPS, *PTOKEN_GROUPS;
This would still force Groups to be pointer-aligned, but it's much less convenient when you think of argument marshalling.
In driver development, developers are sometimes faced with sending arguments from user-mode to kernel-mode via a METHOD_BUFFERED IOCTL. Structures with embedded pointers like this one represent anything from a security flaw waiting to happen to simply a PITA.
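To illustrate the marshalling point in plain C (no Windows APIs involved), here is a small sketch with made-up names Flat, flat_size and flat_clone. Because the structure contains no pointers, the whole message can be moved into another buffer with one memcpy. A struct with an embedded pointer would only have the pointer value copied, which is meaningless, or dangerous, in another address space.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Flat
{
    int count;   /* number of elements in data */
    int data[];  /* payload stored inline, no pointer to chase */
};

/* Total number of bytes occupied by a Flat holding count elements. */
size_t flat_size(int count)
{
    assert(count >= 0);
    return sizeof(struct Flat) + (size_t)count * sizeof(int);
}

/* Hypothetical helper: copies header and payload into a fresh buffer
   with a single memcpy. Returns NULL if malloc fails. */
struct Flat *flat_clone(const struct Flat *src)
{
    size_t bytes = flat_size(src->count);
    struct Flat *dst = malloc(bytes);
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, bytes);
    return dst;
}
```

A receiver of such a buffer still has to validate count against the buffer length before trusting it, but it never has to dereference a pointer supplied by the sender.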
Upvotes: 6
Reputation: 320777
If you declare data as a pointer, you'll have to allocate a separate memory block for the data array, i.e. you'll have to make two allocations instead of one. While there won't be much difference in the actual functionality, it still might have some negative performance impact. It might increase memory fragmentation. It might result in struct memory being allocated "far away" from the data array memory, resulting in poor cache behavior of the data structure. If you use your own memory management routines, like pooled allocators, you'll have to set up two allocators: one for the struct and one for the array.
By using the above technique (known as the "struct hack") you allocate memory for the entire struct (including the data array) in one block, with one call to malloc (or to your own allocator). This is what it is used for. Among other things, it ensures that struct memory is located as close to the array memory as possible (i.e. it is just one continuous block), which is good for the cache behavior of the data structure.
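For contrast, a sketch of the two-allocation pointer version that the struct hack avoids. STwo, stwo_create and stwo_destroy are names invented for this example. Note the second malloc, the extra failure path, and the matching pair of free calls.

```c
#include <assert.h>
#include <stdlib.h>

struct STwo
{
    int count;
    int *data;  /* separately allocated array */
};

/* Hypothetical helper: needs two allocations, which may end up
   far apart in memory. Returns NULL if either one fails. */
struct STwo *stwo_create(int len)
{
    assert(len >= 0);

    struct STwo *s = malloc(sizeof *s);               /* allocation 1: header */
    if (s == NULL)
        return NULL;

    s->data = malloc((size_t)len * sizeof(int));      /* allocation 2: array */
    if (s->data == NULL && len > 0) {
        free(s);                                      /* undo allocation 1 */
        return NULL;
    }

    s->count = len;
    return s;
}

/* Both blocks must be released, array first. */
void stwo_destroy(struct STwo *s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}
```

With the single-block layout, creation is one malloc and destruction is one free, and the header and elements are guaranteed to be adjacent.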
Upvotes: 9