camixetew

Reputation: 89

C libpcap API: casting a packet to a struct (confusing)

After reading this tutorial

(https://www.tcpdump.org/pcap.html)

At the very bottom, the author casts a u_char *packet pointer to a pointer to a struct.

Does such a cast work like this?

Let's say I have this struct:

struct 16bits {
    int8_t a;
    int8_t b;
};

and a 16-bit sequence

0001 0011 0111 1111

If I cast it to the 16bits struct, would it look like this?

a = 0001 0011
b = 0111 1111

The question is whether I understand the author's casting correctly.

I am aware of padding in structs, but let's assume for a moment that the compiler doesn't add any.

Upvotes: 3

Views: 91

Answers (1)

Andrew Henle

Reputation: 1

Let's say I have this struct:

struct 16bits {
    int8_t a;
    int8_t b;
};

and a 16-bit sequence

0001 0011 0111 1111

If I cast it to the 16bits struct, would it look like this?

a = 0001 0011
b = 0111 1111

I assume you mean something like:

// this points at your 16-bit sequence
unsigned char *input_data = ...

struct 16bits *output_data = ( struct 16bits * ) input_data;

uint8_t a_bits = output_data->a;
uint8_t b_bits = output_data->b;

In general, no, you cannot assume that works. Reading memory through a pointer to an incompatible type is a strict aliasing violation and therefore undefined behavior. The "strict aliasing" rule basically says you can't treat memory as something it's not (an int is not a float), with the exception that you can always access any object one char at a time.
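As a rough illustration of that rule and its char exception, here is a minimal sketch. The function names byte_at and float_bits are hypothetical, and the bit pattern mentioned in the usage note assumes the platform uses IEEE-754 single precision for float.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Allowed: any object may be inspected one byte at a time through an
 * unsigned char pointer. */
unsigned char byte_at(const void *obj, size_t size, size_t i)
{
    const unsigned char *bytes = obj;
    assert(i < size);
    return bytes[i];
}

/* Allowed: copy the bytes of a float into a uint32_t object instead of
 * reading the float through a uint32_t pointer. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Not allowed (undefined behavior), shown only as a comment:
 *     uint32_t bits = *(uint32_t *)&f;
 */
```

On an IEEE-754 platform, float_bits(1.0f) should give 0x3F800000, and byte_at can walk over the bytes of any object without violating the rule.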

In addition, as you mention, there can be padding between fields in a structure.

In your specific example, it's almost certain to "work" on just about any platform, because int8_t is almost certainly a signed char, there's almost certainly no padding in the struct 16bits, and any memory can always be accessed as a char value.

Replace the char types with types such as double or int64_t, though, and you can run into alignment as well as padding issues. On some platforms, the misaligned accesses that such casts produce can cause code to fail with SIGSEGV or SIGBUS.
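To make the alignment point concrete, here is a small sketch. The helper names is_aligned_for_u32 and load_u32 are hypothetical; the alignment check relies on C11 _Alignof and on the implementation-defined conversion of a pointer to uintptr_t, which behaves as expected on mainstream platforms but is not guaranteed by the standard.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns nonzero if p is suitably aligned for a uint32_t.
 * Assumption: converting a pointer to uintptr_t yields its address. */
int is_aligned_for_u32(const void *p)
{
    return ((uintptr_t)p % _Alignof(uint32_t)) == 0;
}

/* Loads a uint32_t from any address, aligned or not, by copying bytes.
 * Dereferencing (const uint32_t *)p instead could trap on platforms
 * that enforce alignment. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t value;
    assert(p != NULL);
    memcpy(&value, p, sizeof value);
    return value;
}
```

A packet payload that starts at an odd offset inside a capture buffer is exactly the kind of pointer for which is_aligned_for_u32 would report 0, while load_u32 still reads it safely.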

Assuming 8-bit char values, so that int8_t is actually a character type, a fully standards-compliant way to read your 16-bit sequence as two 8-bit values is to copy the bytes into a real object of the target type:

// assume this points to your 16-bit sequence
unsigned char *input_data = ...

// create a structure that we can actually copy the bits into
struct 16bits output_data;

memcpy( &output_data, input_data, sizeof( output_data ) );
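For reference, a compilable version of the same idea might look like the sketch below. Since 16bits is not a valid C identifier (it starts with a digit), the sketch uses the hypothetical name bits16, and parse_bits16 is likewise an illustrative name. It assumes the struct has no padding, so its size is 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the struct in the question, under a valid name. */
struct bits16 {
    int8_t a;
    int8_t b;
};

/* Copies the first two bytes of input_data into a real struct object. */
struct bits16 parse_bits16(const unsigned char *input_data, size_t len)
{
    struct bits16 output_data;
    assert(len >= sizeof output_data);
    memcpy(&output_data, input_data, sizeof output_data);
    return output_data;
}
```

With the sequence from the question, 0001 0011 0111 1111, stored as the bytes 0x13 and 0x7F, the result has a equal to 0x13 (19) and b equal to 0x7F (127), which matches the layout the question expects.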

Note that if the structure contains elements of a type other than char, there may be padding. And if you use something like #pragma pack to eliminate the padding, you can wind up with code that doesn't run on some platforms.
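One way to sidestep both padding and #pragma pack is to copy each field from its known offset in the input instead of copying the whole struct at once. The sketch below assumes a hypothetical 5-byte wire layout (one tag byte followed by a 4-byte value in host byte order); struct mixed, MIXED_WIRE_SIZE, mixed_padding and parse_mixed are illustrative names only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* In memory the compiler may insert padding after tag so that value
 * is aligned; on the wire the two fields are adjacent. */
struct mixed {
    uint8_t  tag;
    uint32_t value;
};

#define MIXED_WIRE_SIZE 5u

/* Number of padding bytes the compiler added to struct mixed. */
size_t mixed_padding(void)
{
    return sizeof(struct mixed) - MIXED_WIRE_SIZE;
}

/* Copies each field from its wire offset, so padding does not matter. */
struct mixed parse_mixed(const unsigned char *wire, size_t len)
{
    struct mixed m;
    assert(len >= MIXED_WIRE_SIZE);
    m.tag = wire[0];
    memcpy(&m.value, wire + 1, sizeof m.value);
    return m;
}
```

On many mainstream ABIs mixed_padding() is nonzero, which is why a single memcpy of sizeof(struct mixed) bytes from the wire would put value in the wrong place.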

Code such as that in the link you provided is rampant, and it's actually undefined behavior. It "works" because x86, the platform most published code is written on, is very forgiving of misaligned accesses (although there is still a performance penalty). That type of code won't work well at all on a platform that enforces alignment requirements. Search for "pragma pack sigbus" and you'll find lots of examples of programmers surprised when code that ran just fine on x86 fails on ARM or SPARC platforms.
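A portable alternative to casting the packet pointer is to pull each header field out of the byte buffer explicitly. The sketch below uses a made-up 4-byte header (type, flags, big-endian length) purely for illustration; it is not one of the structs from the tutorial, and demo_header, read_be16 and parse_demo_header are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-byte wire header, used only for illustration:
 *   byte 0     type
 *   byte 1     flags
 *   bytes 2-3  length, big-endian (network byte order)
 */
struct demo_header {
    uint8_t  type;
    uint8_t  flags;
    uint16_t length;
};

#define DEMO_HEADER_WIRE_SIZE 4u

/* Reads a big-endian 16-bit value one byte at a time, so neither
 * alignment nor host byte order matters. */
uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}

/* Fills *out from the packet bytes; returns 0 on success and -1 if
 * the buffer is too short to hold the header. */
int parse_demo_header(const unsigned char *packet, size_t len,
                      struct demo_header *out)
{
    assert(packet != NULL && out != NULL);
    if (len < DEMO_HEADER_WIRE_SIZE)
        return -1;
    out->type   = packet[0];
    out->flags  = packet[1];
    out->length = read_be16(packet + 2);
    return 0;
}
```

In a libpcap callback, the packet pointer and captured length would be passed in place of the test buffer. Reading bytes individually also handles the network-to-host byte order conversion that a plain cast leaves to the caller.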

Upvotes: 1
