Flexible array member and pointer member : pros and cons?

Question

What is the difference between using flexible array member (FAM) or pointer member ? In the two cases, a malloc and an affectation element by element must be done. But with FAM, a memory allocation is done for the whole structure and with ptr member, a memory allocation is done for the ptr member only (see code). What are the pros ans the cons of these two methods ?

#include 
#include 

typedef struct farr_mb {
    int lg;
    int arr[];
} Farr_mb;

typedef struct ptr_mb {
    int lg;
    int * ptr;
} Ptr_mb;

 int main() {

    int lg=5;

    Farr_mb *a=malloc(sizeof(Farr_mb)+lg*sizeof(int));
    Ptr_mb b; b.ptr=malloc(lg*sizeof(int));

    for (int i=0;iarr[i]=i;
    for (int i=0;iarr)[i]=i);
    printf("
");
    for (int i=0;i

Nominal Animal · Accepted Answer

Before we get to the pros and cons, let's look at some real-world examples.

Let's say we wish to implement a hash table, where each entry is a dynamically managed array of elements:

struct hash_entry {
    size_t              allocated;
    size_t              used;
    element             array[];
};

struct hash_table {
    size_t              size;
    struct hash_entry **entry;
};
#define HASH_TABLE_INITIALIZER { 0, NULL }

This in fact uses both. The hash table itself is a structure with two members. The size member indicates the size of the hash table, and the entry member is a pointer to an array of hash table entry pointers. This way, each unused entry is just a NULL pointer. When adding elements to a hash table entry, the entire struct entry can be reallocated (for sizeof (struct entry) + allocates * sizeof (element) or freed, as long as the corresponding pointer in the entry member in the struct hash_table is updated accordingly.

If we used element *array instead, we would need use struct hash_entry *entry: in the struct hash_table; or allocate the struct hash_entry separately from the array; or allocate both struct hash_entry and array in the single chunk, with the array pointer pointing just after the same struct hash_entry.

The cost of that would be two extra size_ts worth of memory used for each unused hash table slot, as well as an extra pointer dereference when accessing elements. (Or, to get the address of the array, two consecutive pointer dereferences, instead of one pointer dereference plus offset.) If this is a key structure heavily used in an implementation, that cost can be visible in profiling, and negatively affect cache performance. For random accesses, the larger the element array is, the less difference there is, however; the cost is largest when the arrays are small, and fit within the same cacheline (or a few cachelines) as the allocated and used members.

We do not usually want to make the entry member in the struct hash_table a flexible array member, because that would mean you no longer can declare a hash table statically, using struct hash_table my_table = HASH_TABLE_INITIALIZER;; you would need to use a pointer to a table, and an initializer function: struct hash_table *my_table; my_table = hash_table_init(); or similar.

I do have another example of related data structures using both pointer members and flexible array members. It allows one to use variables of type matrix to represent any 2D matrix with double entries, even when a matrix is a view to another (say, a transpose, a block, a row or column vector, or even a diagonal vector); these views are all equal (unlike in e.g. GNU Scientific Library, where matrix views are represented by a separate data type). This matrix representation approach makes writing robust numerical linear algebra code easy, and the ensuing code is much more readable than when using GSL or BLAS+LAPACK. In my opinion, that is.

So, let's look at the pros and cons, from the point of view of how to choose which approach to use. (For that reason, I will not designate any feature as "pro" or "con", as the determination depends on the context, on each particular use case.)

Structures with flexible array members cannot be initialized statically. You can only refer to them via pointers.

You can declare and initialize structures with pointer members. As shown in above example, using a preprocessor initializer macro can mean you do not need an initializer function. For example, a function accepting a struct hash_table *table parameter can always resize the array of pointers using realloc(table->entry, newsize * sizeof table->entry[0]), even when table->entry is NULL. This reduces the number of functions needed, and simplifies their implementation.
Accessing an array via a pointer member can require an extra pointer dereference.

If we compare the accesses to arrays in statically initialized structures with pointer to the array, to a structure with a flexible array member referred via a static pointer, the same number of dereferences are made.

If we have a function that gets the address of a structure as a parameter, then accessing an array element via a pointer member requires two pointer dereferences, whereas accessing a flexible array element requires only one pointer dereference and one offset. If the array elements are small enough and the array index small enough, so that the accessed array element is in the same cacheline, the flexible array member access is often significantly faster. For larger arrays, the difference in performance tends to be insignificant. This does vary between hardware architectures, however.
Reallocating an array via a pointer member hides the complexity from those using the structure as an opaque variable.

This means that if we have a function that receives a pointer to a structure as a parameter, and that structure has a pointer to a dynamically allocated array, the function can reallocate that array without the caller seeing any change in the structure address itself (only structure contents change).

However, if we have a function that receives a pointer to a structure with a flexible array member, reallocating the array means reallocating the entire structure. That potentially modifies the address of the structure. Because the pointer is passed by value, the modification is not visible to the caller. Thus, a function that may resize a flexible array member, must receive a pointer to a pointer to the structure with a flexible array member.

If the function only examines the contents of a structure with a flexible array member, say counts the number of elements that fulfill some criteria, then a pointer to the structure suffices; and both the pointer and the pointed-to data can be marked const. This might help the compiler produce better code. Furthermore, all the data accessed is linear in memory, which helps more complex processors manage caching more efficiently. (To do the same with an array having a pointer member, one would need to pass the pointer to the array, as well as the size field at least, as parameters to the counting function, instead of a pointer to the structure containing those values.)
An unused/empty structure with a flexible array member can be represented by a NULL pointer (to such structure). This can be important when you have an array of arrays.

With structures with flexible array members, the outer array is just an array of pointers. With structures with pointer members, the outer array can be either an array of structures, or an array of pointers to structures.

Both can support different types of sub-arrays, if the structures have a common type tag as the first member, and you use an union of those structures. (What 'use' means in this context, is unfortunately debatable. Some claim you need to access the array via the union, I claim the visibility of such an union is sufficient because anything else will break a huge amount of existing POSIX C code; basically all server-side C code using sockets.)

Those are the major ones I can think of right now. Both forms are ubiquitous in my own code, and I have had no issues with either. (In particular, I prefer using a structure free helper function that poisons the structure to help detect use-after-free bugs in early testing; and my programs do not often have any memory-related issues.)

I will edit the above list, if I find I've missed important facets. Therefore, if you have a suggestion or think I've overlooked something above, please let me know in a comment, so I can verify and edit as appropriate.

Flexible array member and pointer member : pros and cons?

Answers (1)

Related Questions