How can socket family functions work with different types of structures?

How can socket functions such as connect(), bind(), and accept() work with structures of different types and sizes? For example, connect() takes a pointer to a struct sockaddr as its second argument, but it is fine to pass a pointer to a struct sockaddr_in or struct sockaddr_in6 instead, provided that the third argument, socklen_t namelen, has the right value. Yet these structures have different layouts:

struct sockaddr {
  uint8_t      sa_len;
  sa_family_t  sa_family;
  char         sa_data[14]; 
};

struct sockaddr_in {
  uint8_t         sin_len;   
  sa_family_t     sin_family; 
  in_port_t       sin_port;   
  struct in_addr  sin_addr;    
  char            sin_zero[8]; 
};

struct sockaddr_in6 {
  uint8_t         sin6_len; 
  sa_family_t     sin6_family; 
  in_port_t       sin6_port; 
  uint32_t        sin6_flowinfo; 
  struct in6_addr sin6_addr; 
  uint32_t        sin6_scope_id;
};

struct sockaddr_storage {
  uint8_t      ss_len; 
  sa_family_t  ss_family;
      /* implementation-dependent fields */
};

These structures seem to have almost nothing in common (only the leading length and family fields, and the length field, e.g. sin_len, is not portable because not all platforms have it), yet we use the same functions for all of them. I don't think these functions rely only on the third argument (namelen), because inferring an object's type from its size is non-portable.

Upvotes: 1

Views: 400

Answers (2)

Remy Lebeau

Reputation: 597600

The one thing the structures have in common is that they all start with a family field (the len field is not present on all platforms) that is at the same offset and size for all sockaddr_... types. That field, coupled with the socket's actual address type (established by socket() or accept()) is enough for each function to validate the size and format of any sockaddr you pass in, and thus they can report errors on mismatches.
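The "same offset" claim can be checked at compile time. The following is a minimal sketch, assuming a POSIX system and a C11 compiler; `family_of` is a hypothetical helper name, not part of the sockets API.

```c
#include <assert.h>      /* static_assert (C11) */
#include <stddef.h>      /* offsetof */
#include <netinet/in.h>  /* struct sockaddr_in, struct sockaddr_in6 */
#include <sys/socket.h>  /* struct sockaddr, struct sockaddr_storage */

/* The family field is expected at the same offset in every address type. */
static_assert(offsetof(struct sockaddr, sa_family) ==
              offsetof(struct sockaddr_in, sin_family),
              "IPv4 family field is at a different offset");
static_assert(offsetof(struct sockaddr, sa_family) ==
              offsetof(struct sockaddr_in6, sin6_family),
              "IPv6 family field is at a different offset");
static_assert(offsetof(struct sockaddr, sa_family) ==
              offsetof(struct sockaddr_storage, ss_family),
              "storage family field is at a different offset");

/* Hypothetical helper: read the family through the generic view,
 * whichever concrete sockaddr_... type the caller actually passed. */
sa_family_t family_of(const struct sockaddr *sa)
{
    return sa->sa_family;
}
```

A caller would pass, for example, `(const struct sockaddr *)&some_sockaddr_in6` and get `AF_INET6` back, which is roughly the first check a socket function can make on its argument.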

sockaddr_storage is designed to be large enough to hold any other sockaddr_... struct type, so you can pass a pointer to a sockaddr_storage (cast to struct sockaddr *) to any of the functions. You can also cast that pointer to a pointer to any other sockaddr_... type, choosing the type according to the ss_family field. For input, cast to the desired sockaddr_... type and populate its fields as needed, including the family. For output, look at the ss_family field and then cast to the appropriate sockaddr_... type.
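For the output direction, a sketch of dispatching on ss_family might look like this. It assumes a POSIX system; `describe_addr` and its "host port" output format are made up for illustration.

```c
#include <assert.h>
#include <arpa/inet.h>   /* inet_ntop, ntohs */
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: write "host port" for the address stored in *ss.
 * Returns 0 on success, -1 for an unsupported family or a formatting error. */
int describe_addr(const struct sockaddr_storage *ss, char *out, size_t outlen)
{
    char host[INET6_ADDRSTRLEN];

    assert(ss != NULL && out != NULL);

    switch (ss->ss_family) {
    case AF_INET: {
        /* The family says the buffer holds a sockaddr_in. */
        const struct sockaddr_in *in4 = (const struct sockaddr_in *)ss;
        if (inet_ntop(AF_INET, &in4->sin_addr, host, sizeof host) == NULL)
            return -1;
        snprintf(out, outlen, "%s %u", host, (unsigned)ntohs(in4->sin_port));
        return 0;
    }
    case AF_INET6: {
        /* The family says the buffer holds a sockaddr_in6. */
        const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)ss;
        if (inet_ntop(AF_INET6, &in6->sin6_addr, host, sizeof host) == NULL)
            return -1;
        snprintf(out, outlen, "%s %u", host, (unsigned)ntohs(in6->sin6_port));
        return 0;
    }
    default:
        return -1;  /* a family this sketch does not handle */
    }
}
```

A typical use would be to pass accept() or recvfrom() a sockaddr_storage buffer (cast to struct sockaddr *) and then hand that buffer to a function like this one.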

For example, if the socket's address type is AF_INET (IPv4), connect() requires the sockaddr buffer to be in sockaddr_in format and the namelen parameter to be at least sizeof(struct sockaddr_in). Likewise, accept() populates the sockaddr buffer with data in sockaddr_in format, and the value that the addrlen parameter points to should be at least sizeof(struct sockaddr_in); if it is smaller, the returned address is truncated.

For AF_INET6 (IPv6), substitute sockaddr_in6 for sockaddr_in.

The same applies to the other functions.

In general, the size of your sockaddr buffer must be large enough to hold the correct sockaddr_... struct that belongs to the socket's address type. Functions that accept an address as input (connect(), bind(), sendto()) require the buffer to be formatted in the correct sockaddr_... format. Functions that return an address as output (accept(), recvfrom()) will format the data using the appropriate sockaddr_... type.
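For the input direction, here is a rough sketch of filling a buffer in the right format and passing the matching length. It assumes a POSIX system; `fill_addr` and `connect_text` are hypothetical helper names. On platforms that have a length field (sin_len / sin6_len) this sketch leaves it zeroed, which may or may not be acceptable there.

```c
#include <assert.h>
#include <arpa/inet.h>   /* inet_pton, htons */
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: parse a numeric IPv4 or IPv6 address into *ss.
 * Returns the length to pass as namelen, or 0 if the text is not an address. */
socklen_t fill_addr(struct sockaddr_storage *ss, const char *text, uint16_t port)
{
    assert(ss != NULL && text != NULL);

    memset(ss, 0, sizeof *ss);
    struct sockaddr_in *in4 = (struct sockaddr_in *)ss;
    if (inet_pton(AF_INET, text, &in4->sin_addr) == 1) {
        in4->sin_family = AF_INET;
        in4->sin_port = htons(port);
        return sizeof *in4;       /* namelen for an AF_INET socket */
    }

    memset(ss, 0, sizeof *ss);
    struct sockaddr_in6 *in6 = (struct sockaddr_in6 *)ss;
    if (inet_pton(AF_INET6, text, &in6->sin6_addr) == 1) {
        in6->sin6_family = AF_INET6;
        in6->sin6_port = htons(port);
        return sizeof *in6;       /* namelen for an AF_INET6 socket */
    }
    return 0;
}

/* Hypothetical helper: connect an already created socket to a numeric address.
 * The socket's own family (chosen in socket()) must match the parsed address. */
int connect_text(int fd, const char *text, uint16_t port)
{
    struct sockaddr_storage ss;
    socklen_t len = fill_addr(&ss, text, port);
    if (len == 0)
        return -1;
    return connect(fd, (const struct sockaddr *)&ss, len);
}
```

The cast to `const struct sockaddr *` only satisfies the prototype; the function decides how to interpret the buffer from the socket's address type, the family field, and the length.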

Upvotes: 1

Jonathan Leffler

Reputation: 754710

It is in part a legacy from pre-standard C, when things were more lax. For example, the code predates function prototypes.

Actually, as declared in the question, those structures all have two things in common:

  1. The first field is a uint8_t that gives the length of the structure (on platforms that have this field at all).
  2. The second field is a sa_family_t that identifies which structure type is in use.

The C standard says:

§6.5.2.3 Structure and union members

¶6 One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

Taking minor liberties, if you consider the type passed to the socket functions to be a pointer to a union of the various separate types, then you can see that the code can access those first two fields to determine what type is actually in use.
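To make the union view concrete, here is an illustrative sketch, assuming a POSIX system and a C11 compiler. The union `any_sockaddr` and the function `any_sockaddr_len` are hypothetical names, not anything the sockets API defines.

```c
#include <assert.h>      /* static_assert (C11) */
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical union of the address types discussed here. */
union any_sockaddr {
    struct sockaddr         sa;
    struct sockaddr_in      in4;
    struct sockaddr_in6     in6;
    struct sockaddr_storage ss;
};

/* The union is at least as large as its largest member. */
static_assert(sizeof(union any_sockaddr) >= sizeof(struct sockaddr_storage),
              "union must be able to hold a sockaddr_storage");

/* Inspect the shared leading family field through the generic member,
 * then report the size of the structure that the family implies.
 * Returns 0 for families this sketch does not know about. */
socklen_t any_sockaddr_len(const union any_sockaddr *u)
{
    switch (u->sa.sa_family) {
    case AF_INET:
        return sizeof u->in4;
    case AF_INET6:
        return sizeof u->in6;
    default:
        return 0;
    }
}
```

Reading `u->sa.sa_family` after storing through `in4` or `in6` is the kind of access the common-initial-sequence rule is meant to permit, provided the leading members really do have compatible types on the platform in question.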

I'm not convinced that this design is what you'd come up with now if you were redesigning the sockets system from scratch, but it can still be made to work.

Upvotes: 1
