How to define struct 'Atom' in C? Which is better? Why?

Question

An ATOM is a pointer to a unique, constant string. A string in C must end with '\0'.

I will show two ways to define an 'ATOM TABLE' structure in 'C':

struct atom1 {
    struct atom1 *link;
    int len;
    char *str;
} *bucket[2048];

and

struct atom2 {
    struct atom2 *link;
    int len;
    char str[1];
} *bucket[2048];

So, when I want to allocate memory for these two types of ATOM, I also have two ways:

// memory + 1 for '\0'
struct atom1 *p = malloc(sizeof(*p) + len + 1);

and

// memory for '\0' is already included in the definition of struct atom2
struct atom2 *p = malloc(sizeof(*p) + len);

So when allocating memory, 'atom2' looks better. On the other hand, accessing the string in 'atom2' breaks the rules of C, because the member is declared as 'char str[1];' and any index past 0 is out of bounds.

Is 'atom2' really good?

Lundin · Accepted Answer

atom1 doesn't make any sense, because you should allocate memory dynamically for what str points at, not for the whole struct. As the code currently stands, there is no sound way in which you will be able to use atom1.
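To make the point about atom1 concrete, here is a minimal sketch of how the pointer version could be allocated soundly, with the node and the string in two separate allocations. The helper names atom1_new and atom1_free are hypothetical and not part of the question; this is one possible arrangement, not the only one.

```c
#include <stdlib.h>
#include <string.h>

struct atom1 {
    struct atom1 *link;
    int len;
    char *str;
};

/* Hypothetical helper: allocates the node, then the string it points at. */
struct atom1 *atom1_new(const char *s, int len)
{
    struct atom1 *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;

    p->str = malloc((size_t)len + 1);   /* + 1 for the terminating '\0' */
    if (p->str == NULL) {
        free(p);
        return NULL;
    }

    memcpy(p->str, s, (size_t)len);
    p->str[len] = '\0';
    p->len = len;
    p->link = NULL;
    return p;
}

/* Hypothetical helper: two allocations mean two calls to free. */
void atom1_free(struct atom1 *p)
{
    if (p != NULL) {
        free(p->str);
        free(p);
    }
}
```

The cost of this layout is an extra allocation and an extra pointer dereference per atom.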

atom2 invokes undefined behavior. This was known as the "struct hack" in the old C standard (C90), and was never guaranteed to work. Writing out of bounds of the fixed array is not allowed, even though you may have allocated extra memory after the struct, because you don't know where the struct ends: it could have padding bytes.
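The padding remark can be checked with offsetof. The sketch below only measures the layout of atom2; it does not make the struct hack valid. The helper name atom2_tail_padding is hypothetical, and the result depends on the compiler and ABI (on many 64-bit platforms it is likely nonzero).

```c
#include <stddef.h>

struct atom2 {
    struct atom2 *link;
    int len;
    char str[1];
};

/* Hypothetical helper: number of padding bytes after str[0], if any. */
size_t atom2_tail_padding(void)
{
    return sizeof(struct atom2) - (offsetof(struct atom2, str) + sizeof(char));
}
```

If the result is nonzero, sizeof(struct atom2) + len is not the number of bytes the string actually needs, which is part of why the size arithmetic in the question is unreliable.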

Is 'atom2' really good?

Neither method is good; don't use either of them. In modern C (C99 and later), you can do this safely by using a flexible array member:

typedef struct atom3 
{
  struct atom3* link;
  size_t        length;
  char          str[];
} atom3_t;

And then allocate memory as:

atom3_t* p = malloc(sizeof(*p) + length + 1);

After that, you can safely use str as if it were an array of size length + 1.
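Putting the flexible array member together, a minimal sketch of a constructor might look like this. The function name atom3_new is hypothetical; the struct and the malloc expression follow the answer above.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct atom3
{
  struct atom3* link;
  size_t        length;
  char          str[];   /* flexible array member, must be last */
} atom3_t;

/* Hypothetical helper: one allocation holds both the node and the string. */
atom3_t* atom3_new(const char* s, size_t length)
{
  atom3_t* p = malloc(sizeof(*p) + length + 1);  /* + 1 for '\0' */
  if (p == NULL)
    return NULL;

  p->link   = NULL;
  p->length = length;
  memcpy(p->str, s, length);
  p->str[length] = '\0';
  return p;
}
```

Because everything lives in a single block, a single free(p) releases both the node and its string.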
