lulle2007200

Reputation: 927

Is casting and dereferencing struct pointers of "compatible" structs allowed?

Suppose I have something like:

list.h:

//...
#include <stdlib.h>
typedef struct node_s{
    struct node_s *next;
    struct node_s *prev;

    char data[];
}node_t;
    
void* getDataFromNode(node_t *node){
    return(node->data);
}

node_t* newNode(size_t size){
    node_t *ret = malloc(sizeof(node_t));
    return(ret);
}
//...

main.c:

#include "list.h"
#include <stddef.h>
typedef struct float_node_s{
    struct float_node_s *next;
    struct float_node_s *prev;

    float someFloat;
}float_node_t;

int main(void){
    float *f;
    float_node_t *node;
    //1)
    node = (float_node_t*)newNode(sizeof(float_node_t));
    if(node == NULL){
        return(1);
    }
    //2)
    f = (float*)getDataFromNode((node_t*)node);
    return(0);
}

This is something I've seen in a lot of data structure (e.g., list, tree) implementations.

Am I allowed to do this?

Specifically, can I cast a node_t pointer to a float_node_t pointer and assign it to a float_node_t pointer variable, as at 1)? What if I then dereference the float_node_t pointer to access the float stored in it? I guess 2) is already disallowed: the returned pointer points to a char array, and it is cast to a float pointer.

The C standard says that pointers to different structs have equal representations and alignment requirements, that reordering of struct elements does not happen, and that if structs have a common initial sequence, the layout of the initial sequence of those structs will be equal.

So casting the pointers and dereferencing to access the prev/next fields seems fine, but isn't that already a violation of C's strict aliasing rules?

There are quite a few somewhat similar questions about casting pointers between "compatible" structs, but often the answers disagree or even contradict each other. Some say it is fine to access the fields of the common initial sequence through either pointer, while some say you can't even dereference the cast pointer.

Upvotes: 3

Views: 490

Answers (3)

Lundin

Reputation: 213892

There's a special rule, see C17 6.5.2.3:

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

So if there is a union of the two structs visible within the same translation unit, you should be able to inspect the common initial sequence of every struct no matter the type. However, compilers have shaky support for this particular rule and there have been C language defect reports about it.
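As a minimal sketch of what that guarantee covers, assuming hypothetical types base_node, float_node and any_node (none of them from the question), the shared next/prev members can be read through either struct as long as the access goes through the visible union:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration. The first two members have
   identical types in both structs, so they form a common initial sequence. */
struct base_node  { struct base_node *next; struct base_node *prev; };
struct float_node { struct base_node *next; struct base_node *prev; float value; };

/* The completed union type is visible wherever the members are inspected. */
union any_node {
    struct base_node  base;
    struct float_node f;
};

/* Reads the shared `next` member through the `base` view of the union. */
struct base_node *any_node_next(const union any_node *n)
{
    return n->base.next;
}

/* Returns 1 if the common part of a float_node can be read via `base`. */
int common_initial_sequence_demo(void)
{
    union any_node a = { .f = { .next = NULL, .prev = NULL, .value = 1.5f } };
    union any_node b = { .f = { .next = &a.base, .prev = NULL, .value = 2.5f } };

    /* Both unions currently hold a float_node; only next/prev are inspected. */
    assert(any_node_next(&a) == NULL);
    return any_node_next(&b) == &a.base;
}
```

Reading b.base.next is covered by the rule; reading anything past the shared members through the wrong struct is not.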

Notably however, this special rule is in line with the "strict aliasing rule", which allows lvalue access of struct/union member through a compatible type, "an aggregate or union type that includes one of the aforementioned (compatible) types among its members".

However, none of this allows wild type punning between two structures where you re-interpret the part which isn't shared differently - that's just a strict aliasing violation and UB.

We shouldn't write programs relying on language lawyering of these shaky rules though. The correct solution here is:

typedef struct node
{
  struct node* next;
  struct node* prev;
} node_t;

typedef struct
{
  node_t parent;
  float  data;
} float_node_t;

typedef struct
{
  node_t parent;
  char   data[n];
} str_node_t;

This is how polymorphism works in C - you can now use float_node_t* cast to node_t* and pass it along to any function expecting a node_t*.
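A short sketch of that idea, using the node_t and float_node_t types above plus two hypothetical helpers (node_insert_after and float_list_sum are illustrative names, not part of the question):

```c
#include <assert.h>
#include <stddef.h>

typedef struct node
{
  struct node* next;
  struct node* prev;
} node_t;

typedef struct
{
  node_t parent;
  float  data;
} float_node_t;

/* Hypothetical generic helper: links n directly after pos.
   It only knows about node_t and works for any embedding type. */
void node_insert_after (node_t* pos, node_t* n)
{
  n->prev = pos;
  n->next = pos->next;
  if(pos->next != NULL)
  {
    pos->next->prev = n;
  }
  pos->next = n;
}

/* Hypothetical helper: assumes every node in the list is a float_node_t. */
float float_list_sum (const node_t* head)
{
  float sum = 0.0f;
  for(const node_t* it = head; it != NULL; it = it->next)
  {
    sum += ((const float_node_t*)it)->data;
  }
  return sum;
}

float embed_demo (void)
{
  float_node_t a = { .parent = { NULL, NULL }, .data = 1.5f };
  float_node_t b = { .parent = { NULL, NULL }, .data = 2.0f };

  /* float_node_t* converted to node_t* points at the initial member */
  node_insert_after((node_t*)&a, (node_t*)&b);
  assert(a.parent.next == &b.parent);

  return float_list_sum(&a.parent);
}
```

The cast back to float_node_t* in float_list_sum is only valid because each node really is a float_node_t object.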

Optionally you can also add an enum in there to keep track of the type, if you wish to change the type at run-time. And you could do polymorphism with function pointers. Here's an example:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct node
{
  struct node* next;
  struct node* prev;
  void (*print)(struct node*);
} node_t;

typedef struct
{
  node_t parent;
  float  data;
} float_node_t;
    
typedef struct
{
  node_t parent;
  char   data[100];
} str_node_t;

node_t* float_node_create (float f);
node_t* str_node_create (const char* s);

void float_node_print (struct node* this);
void str_node_print (struct node* this);

#define node_create(data)                   \
  _Generic( (data),                         \
            float: float_node_create,       \
            char*: str_node_create )(data)

int main (void)
{
  node_t* n1 = node_create(1.0f);
  node_t* n2 = node_create("hello world");
  n1->print(n1);
  n2->print(n2);
  
  free(n1);
  free(n2);
  return 0;   
}

void float_node_print (struct node* this)
{
  printf("%f\n", ((float_node_t*)this)->data );
}

void str_node_print (struct node* this)
{
  puts( ((str_node_t*)this)->data );
}

node_t* float_node_create (float f)
{
  float_node_t* obj = malloc(sizeof *obj);
  obj->data  = f;
  obj->parent.print = float_node_print;
  return (node_t*)obj;
}

node_t* str_node_create (const char* s)
{
  str_node_t* obj = malloc(sizeof *obj);
  strcpy(obj->data,s);
  obj->parent.print = str_node_print;
  return (node_t*)obj;
}

All of which is well-defined behavior and portable standard C. This relies on a far more established and safe language rule, found in 6.7.2.1:

Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
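The enum variant mentioned earlier could look roughly like this. It is a sketch only: node_type_t, NODE_FLOAT, NODE_STR and node_get_float are hypothetical names. It also checks the pointer conversion that the quoted rule guarantees:

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NODE_FLOAT, NODE_STR } node_type_t;  /* hypothetical type tag */

typedef struct node
{
  struct node* next;
  struct node* prev;
  node_type_t  type;   /* records which struct embeds this node */
} node_t;

typedef struct
{
  node_t parent;       /* must be the initial member */
  float  data;
} float_node_t;

/* Returns the float payload, or fallback if the node is of another kind. */
float node_get_float (const node_t* n, float fallback)
{
  if(n->type != NODE_FLOAT)
  {
    return fallback;
  }
  return ((const float_node_t*)n)->data;
}

int first_member_demo (void)
{
  float_node_t obj = { .parent = { NULL, NULL, NODE_FLOAT }, .data = 4.0f };
  node_t* base = &obj.parent;

  /* A pointer to the struct and a pointer to its initial member
     convert to one another, with no padding at the beginning. */
  assert((void*)base == (void*)&obj);
  assert((float_node_t*)base == &obj);

  return node_get_float(base, -1.0f) == 4.0f;
}
```

Checking the tag before the downcast keeps a str_node_t from being read as a float_node_t by mistake.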

Upvotes: 4

0___________

Reputation: 67546

Do not use types in sizeof, only objects, especially in this kind of code. Using types is error-prone.
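A small sketch of the difference, using a hypothetical point3_t type and points_create function. With sizeof *pts the allocation size follows the pointer's declared type, so changing that type later cannot leave a stale sizeof(type) behind:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { double x, y, z; } point3_t;   /* hypothetical type */

/* Allocates count elements; the caller must free the result. */
point3_t* points_create (size_t count)
{
    /* sizeof *pts is taken from the object, not from a repeated type name */
    point3_t *pts = malloc(count * sizeof *pts);
    return(pts);
}

/* Returns 1 if the last element of a 4-element array can be written and read. */
int sizeof_demo (void)
{
    point3_t *pts = points_create(4);
    if(pts == NULL){
        return(0);
    }
    assert(sizeof *pts == sizeof(point3_t));
    pts[3].z = 1.0;
    int ok = (pts[3].z == 1.0);
    free(pts);
    return(ok);
}
```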

Pointer punning, in this case, is IMO invalid (it violates the strict-aliasing rules).

I would do it this way.

#include <stdlib.h>

typedef struct node_s{
    struct node_s *next;
    struct node_s *prev;

    char data[];
}node_t;


typedef struct float_node_s{
    struct float_node_s *next;
    struct float_node_s *prev;
    
    float someFloat;
}float_node_t;

typedef union
{
    node_t node_c;
    float_node_t node_f;
}node_ut;

float getFloatDataFromNode(node_ut *node){
    return node -> node_f.someFloat;
}

void* newNode(size_t size){
    node_t *ret = malloc(sizeof(*ret) + size);
    return(ret);
}


int main(void){
    float f;
    node_ut *node;
    //1)
    node = newNode(sizeof(node -> node_f.someFloat));
    if(node == NULL){
        return(1);
    }
    //2)
    f = getFloatDataFromNode(node);
    return(0);
}

https://godbolt.org/z/c3csvoEeY

Upvotes: -1

alagner

Reputation: 4062

So casting the pointers and dereferencing to access the prev/next fields seems fine, but isn't that already a violation of C's strict aliasing rules?

It is: it does violate the strict-aliasing rule. The problem is that it may have become a widely used pattern because it often compiles to the expected code anyway.
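One way to keep the generic node_t from the question without punning pointers is to copy the payload in and out with memcpy. This is a different technique from the casts in the question, shown as a sketch only; newNodeWithData and copyDataFromNode are hypothetical helpers:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct node_s{
    struct node_s *next;
    struct node_s *prev;

    char data[];
}node_t;

/* Hypothetical helper: allocates a node and copies size payload bytes from src. */
node_t* newNodeWithData(const void *src, size_t size){
    node_t *ret = malloc(sizeof(*ret) + size);
    if(ret == NULL){
        return(NULL);
    }
    ret->next = NULL;
    ret->prev = NULL;
    memcpy(ret->data, src, size);
    return(ret);
}

/* Hypothetical helper: copies size payload bytes out of the node into dst. */
void copyDataFromNode(const node_t *node, void *dst, size_t size){
    memcpy(dst, node->data, size);
}

/* Returns 1 if a float survives the round trip through the char payload. */
int memcpy_demo(void){
    float in = 2.5f;
    float out = 0.0f;
    node_t *node = newNodeWithData(&in, sizeof in);
    if(node == NULL){
        return(0);
    }
    copyDataFromNode(node, &out, sizeof out);
    free(node);
    return(out == 2.5f);
}
```

The float is never accessed through a pointer of another type, and the copy also avoids any alignment concern about where data[] starts.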

Upvotes: 3
