Reputation: 65
When parsing a TCP header, there's a 4-bit field named data offset. While parsing, multi-byte fields need to be converted to host byte order. Here's the question: for fields that are not 16 or 32 bits long, which means I can't use ntohs or ntohl, do I reverse them field-wise or byte-wise, or in another way?
Let's suppose one byte contains two fields, f1 and f2, each 4 bits wide, and the data is 1000 0100. For the field-wise reversal, the result should be 0001 0010. For the byte-wise reversal, the result is 0010 0001. Which one is correct?
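To make the distinction concrete, here is a small sketch of the two candidate operations (rev4 is just a helper written for this example):

#include <stdint.h>
#include <stdio.h>

/* Reverse the order of the low 4 bits. */
static uint8_t rev4(uint8_t n) {
    uint8_t r = 0;
    for (int i = 0; i < 4; i++)
        if (n & (1u << i))
            r |= (uint8_t)(1u << (3 - i));
    return r;
}

int main(void) {
    uint8_t b = 0x84;               /* 1000 0100 */
    uint8_t f1 = (uint8_t)(b >> 4); /* 1000 */
    uint8_t f2 = b & 0x0f;          /* 0100 */

    /* Field-wise: reverse the bits inside each field, keep field order. */
    uint8_t field_wise = (uint8_t)((rev4(f1) << 4) | rev4(f2));

    /* Byte-wise: reverse all 8 bits of the byte. */
    uint8_t byte_wise = (uint8_t)((rev4(f2) << 4) | rev4(f1));

    /* Prints field-wise: 0x12 (0001 0010), byte-wise: 0x21 (0010 0001). */
    printf("field-wise: 0x%02x, byte-wise: 0x%02x\n",
           (unsigned)field_wise, (unsigned)byte_wise);
    return 0;
}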
Update: Here is the struct I'm using to parse the header:
#pragma pack(push, 1)
struct tcp_hdr_t {
    uint16_t src_port;
    uint16_t dst_port;
    uint32_t seq;
    uint32_t ack;
    uint8_t data_offset : 4;
    uint8_t f_reserved : 3;
    uint8_t f_ns : 1;
    uint8_t f_cwr : 1;
    uint8_t f_ece : 1;
    uint8_t f_urg : 1;
    uint8_t f_ack : 1;
    uint8_t f_psh : 1;
    uint8_t f_rst : 1;
    uint8_t f_syn : 1;
    uint8_t f_fin : 1;
    uint16_t window_size;
    uint16_t checksum;
    uint16_t urgent_p;
};
#pragma pack(pop)
If I don't reverse the data offset and flag fields, the result is wrong compared with what Wireshark shows. The raw data is 0xa002, and the data offset comes out as 0xa, so that field doesn't seem to need reversing, but the flags part seems reversed.
Upvotes: 1
Views: 1027
Reputation: 224882
The problem you're seeing has to do with how bit fields are implemented.
From section 6.7.2.1p11 of the C standard regarding structure and union specifiers:
An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
What this means is that you can't portably depend on the ordering of bit fields. As an example of this, the file /usr/include/netinet/tcp.h on Linux contains the following:
struct tcphdr
{
    __extension__ union
    {
        struct
        {
            u_int16_t th_sport; /* source port */
            u_int16_t th_dport; /* destination port */
            tcp_seq th_seq;     /* sequence number */
            tcp_seq th_ack;     /* acknowledgement number */
# if __BYTE_ORDER == __LITTLE_ENDIAN
            u_int8_t th_x2:4;   /* (unused) */
            u_int8_t th_off:4;  /* data offset */
# endif
# if __BYTE_ORDER == __BIG_ENDIAN
            u_int8_t th_off:4;  /* data offset */
            u_int8_t th_x2:4;   /* (unused) */
# endif
            u_int8_t th_flags;
# define TH_FIN 0x01
# define TH_SYN 0x02
# define TH_RST 0x04
# define TH_PUSH 0x08
# define TH_ACK 0x10
# define TH_URG 0x20
            u_int16_t th_win;   /* window */
            u_int16_t th_sum;   /* checksum */
            u_int16_t th_urp;   /* urgent pointer */
        };
        struct
        {
            u_int16_t source;
            u_int16_t dest;
            u_int32_t seq;
            u_int32_t ack_seq;
# if __BYTE_ORDER == __LITTLE_ENDIAN
            u_int16_t res1:4;
            u_int16_t doff:4;
            u_int16_t fin:1;
            u_int16_t syn:1;
            u_int16_t rst:1;
            u_int16_t psh:1;
            u_int16_t ack:1;
            u_int16_t urg:1;
            u_int16_t res2:2;
# elif __BYTE_ORDER == __BIG_ENDIAN
            u_int16_t doff:4;
            u_int16_t res1:4;
            u_int16_t res2:2;
            u_int16_t urg:1;
            u_int16_t ack:1;
            u_int16_t psh:1;
            u_int16_t rst:1;
            u_int16_t syn:1;
            u_int16_t fin:1;
# else
# error "Adjust your <bits/endian.h> defines"
# endif
            u_int16_t window;
            u_int16_t check;
            u_int16_t urg_ptr;
        };
    };
};
You can see here the hoops that need to be jumped through to get things in the right place. Other implementations might do it differently.
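As a quick experiment on your own compiler, you can overlay a two-field bit-field struct on a known byte and see which field picks up which bits. A sketch:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct nibbles {
    uint8_t first : 4;  /* declared first */
    uint8_t second : 4; /* declared second */
};

int main(void) {
    uint8_t raw = 0xa0; /* the offset/reserved byte from the question */
    struct nibbles n;
    memcpy(&n, &raw, 1);

    /* On a typical little-endian ABI (e.g. GCC on x86) the first-declared
       field occupies the low-order bits, so this prints first=0 second=a;
       another implementation is free to print first=a second=0. */
    printf("first=%x second=%x\n", (unsigned)n.first, (unsigned)n.second);
    return 0;
}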
The best way to handle this in your code is to get rid of the bit fields and replace them with a pair of uint8_t members, then use bitmasks to extract the necessary subfields.
For example:
struct tcp_hdr_t {
    uint16_t src_port;
    uint16_t dst_port;
    uint32_t seq;
    uint32_t ack;
    uint8_t offset_flags1;
    uint8_t flags2;
    uint16_t window_size;
    uint16_t checksum;
    uint16_t urgent_p;
};
#define DATA_OFFSET(hdr) (((hdr).offset_flags1 & 0xf0) >> 4)
#define FLAG_NONCE(hdr) (((hdr).offset_flags1 & 0x01) >> 0)
#define FLAG_CWR(hdr) (((hdr).flags2 & 0x80) >> 7)
#define FLAG_ECE(hdr) (((hdr).flags2 & 0x40) >> 6)
#define FLAG_URG(hdr) (((hdr).flags2 & 0x20) >> 5)
#define FLAG_ACK(hdr) (((hdr).flags2 & 0x10) >> 4)
#define FLAG_PSH(hdr) (((hdr).flags2 & 0x08) >> 3)
#define FLAG_RST(hdr) (((hdr).flags2 & 0x04) >> 2)
#define FLAG_SYN(hdr) (((hdr).flags2 & 0x02) >> 1)
#define FLAG_FIN(hdr) (((hdr).flags2 & 0x01) >> 0)
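Using the struct and macros above might look like this (a sketch; buf is assumed to point at the raw bytes of a captured TCP header):

#include <arpa/inet.h> /* ntohs, ntohl */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

void parse_tcp(const uint8_t *buf)
{
    struct tcp_hdr_t hdr;
    memcpy(&hdr, buf, sizeof hdr); /* copy out to avoid alignment problems */

    /* The 16- and 32-bit fields are converted with ntohs/ntohl as usual. */
    unsigned src = ntohs(hdr.src_port);
    unsigned dst = ntohs(hdr.dst_port);
    unsigned long seq = ntohl(hdr.seq);

    /* The two flag bytes have no byte order to convert; the macros just
       mask out the bits. For the 0xa0 0x02 bytes from the question this
       reports a data offset of 10 with only SYN set. */
    printf("%u -> %u seq %lu: offset=%u syn=%u ack=%u\n",
           src, dst, seq,
           (unsigned)DATA_OFFSET(hdr), (unsigned)FLAG_SYN(hdr),
           (unsigned)FLAG_ACK(hdr));
}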
Upvotes: 1
Reputation: 6779
You get your answer the moment you say host byte order.
As in your question, if you have 1000 0100:
0001 0010 is a bit-wise reversal within each nibble, and that has nothing to do with byte order.
0010 0001 is a nibble-wise reversal, and that too has nothing to do with byte order.
(1 nibble = 4 bits, 1 byte = 8 bits.)
From the popular Beej's Guide:
Just to make you really unhappy, different computers use different byte orderings internally for their multibyte integers (i.e. any integer that's larger than a char.) The upshot of this is that if you send() a two-byte short int from an Intel box to a Mac (before they became Intel boxes, too, I mean), what one computer thinks is the number 1, the other will think is the number 256, and vice-versa.
The way to get around this problem is for everyone to put aside their differences and agree that Motorola and IBM had it right, and Intel did it the weird way, and so we all convert our byte orderings to "big-endian" before sending them out. Since Intel is a "little-endian" machine, it's far more politically correct to call our preferred byte ordering "Network Byte Order". So these functions convert from your native byte order to network byte order and back again.
If you are dealing with just 1 byte, you don't even need to bother doing anything.
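For instance, extracting the data offset from that byte is just a shift and a mask, which operate on the value rather than on memory layout, so the result is the same on little- and big-endian hosts. A sketch:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t offset_byte = 0xa0; /* first byte of the 0xa002 from the question */

    uint8_t data_offset = offset_byte >> 4;   /* high nibble: 10 */
    uint8_t reserved_ns = offset_byte & 0x0f; /* low nibble: 0 */

    /* No ntohs/ntohl-style conversion is needed for a single byte. */
    printf("data offset=%u reserved/NS=%u\n",
           (unsigned)data_offset, (unsigned)reserved_ns);
    return 0;
}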
Upvotes: 3