Reputation: 922
I'm trying to interpret WebSocket frames that I receive over a TCP connection. I want to do this in pure C (so no reinterpret_cast). The format is specified in IETF RFC 6455. I want to fill the following struct:
typedef struct {
uint8_t flags;
uint8_t opcode;
uint8_t isMasked;
uint64_t payloadLength;
uint32_t maskingKey;
char* payloadData;
} WSFrame;
with the following function:
static void parseWsFrame(char *data, WSFrame *frame) {
frame->flags = (*data) & FLAGS_MASK;
frame->opcode = (*data) & OPCODE_MASK;
//next byte
data += 1;
frame->isMasked = (*data) & IS_MASKED;
frame->payloadLength = (*data) & PAYLOAD_MASK;
//next byte
data += 1;
if (frame->payloadLength == 126) {
frame->payloadLength = *((uint16_t *)data);
data += 2;
} else if (frame->payloadLength == 127) {
frame->payloadLength = *((uint64_t *)data);
data += 8;
}
if (frame->isMasked) {
frame->maskingKey = *((uint32_t *)data);
data += 4;
}else{
//still need to initialize it to shut up the compiler
frame->maskingKey = 0;
}
frame->payloadData = data;
}
The code is for the ESP8266, so debugging is only possible with printfs to the serial console. Using this method, I discovered that the code crashes right after the line frame->maskingKey = *((uint32_t *)data); and that the first two ifs get skipped, so this is the first time I cast the pointer to another pointer type.
The data is not \0-terminated, but I get the size in the data-received callback. In my test, I'm trying to send the message 'test' over the already established WebSocket, and the received data length is 10, so at the point where the code crashes I expect data to be offset by 2 bytes from its initial position, which leaves enough data to read the following 4 bytes.
I have not written any C for a long time, so I expect it is only a small error in my code.
PS: I've seen a lot of code that interprets the values byte by byte and shifts them into place, but I see no reason why this method should not work as well.
Upvotes: 0
Views: 179
Reputation: 141628
There are two problems:
1. data might not be correctly aligned for uint32_t.
2. data might not be in the same byte order that your hardware uses to represent integer values (sometimes called an "endianness issue").
To write reliable code, look at the message specification to see which order the bytes arrive in. If they are most-significant-byte first, then the portable version of your code would be:
unsigned char *udata = (unsigned char *)data;
frame->maskingKey = udata[0] * 0x1000000ul
+ udata[1] * 0x10000ul
+ udata[2] * 0x100ul
+ udata[3];
This might look like a handful at first, but you could make an inline function that takes a pointer as argument and returns the uint32_t, which will keep your code readable.
A similar problem applies to your reads of uint16_t and uint64_t.
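As a rough sketch of such inline functions: the names read_be16, read_be32 and read_be64 are made up for illustration, and the code assumes the fields arrive most-significant-byte first (network byte order, which is what RFC 6455 uses for the extended payload length). Shifts are used instead of multiplications, but the idea is the same as above.

```c
#include <stdint.h>

// Hypothetical helpers (names are not from the question): build an integer
// from big-endian bytes one byte at a time. No wide load is performed, so
// the pointer may have any alignment, and the result does not depend on the
// byte order of the CPU.
static inline uint16_t read_be16(const unsigned char *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

static inline uint32_t read_be32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}

static inline uint64_t read_be64(const unsigned char *p) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];  // shift in the next most significant byte
    }
    return v;
}
```

In the question's function this would be used as, for example, frame->maskingKey = read_be32((const unsigned char *)data); with the two length reads handled the same way through read_be16 and read_be64.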
Upvotes: 0
Reputation: 2899
The problem with casting a char* to a pointer to a larger type is that some architectures do not allow unaligned reads.
That is, for example, if you try to read a uint32_t through a pointer, then the value of the pointer itself has to be a multiple of 4. Otherwise, on some architectures, you will get a fault of some sort (a bus error signal, a trap, an exception, and so on).
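To make the "multiple of 4" condition concrete, here is a small sketch of an alignment check. The function name is_aligned_for_u32 is hypothetical, _Alignof needs a C11 compiler, and converting a pointer to uintptr_t and taking the remainder is implementation-defined, although it behaves as expected on common flat-memory targets.

```c
#include <stdint.h>

// Hypothetical helper: returns nonzero if p could be dereferenced as a
// uint32_t* without violating the type's alignment requirement.
static int is_aligned_for_u32(const void *p) {
    return ((uintptr_t)p % _Alignof(uint32_t)) == 0;
}
```

With a buffer that itself starts on a 4-byte boundary, data + 2 (the position of the masking key when the first two ifs are skipped) fails this check on a platform where uint32_t requires 4-byte alignment, which matches the point of the crash.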
Because this data is coming in over TCP and the format of the stream / protocol is laid out without any padding, you will likely need to copy it out of the buffer into local variables byte by byte (for example, using memcpy) as appropriate. For example:
if (frame->isMasked) {
memcpy(&frame->maskingKey, data, 4); // needs #include <string.h>
data += 4;
// TODO: handle endianness: e.g.: frame->maskingKey = ntohl(frame->maskingKey);
}else{
//still need to initialize it to shut up the compiler
frame->maskingKey = 0;
}
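The memcpy step and the endianness TODO can be combined into small helpers, sketched below. The names load_net16 and load_net32 are invented for this example, and ntohs/ntohl are taken from the POSIX header <arpa/inet.h>, which is an assumption that holds on a desktop build; on the ESP8266 the equivalent declarations would have to come from whatever header the SDK's network stack provides.

```c
#include <arpa/inet.h>  // ntohs, ntohl (POSIX; assumed available on the build host)
#include <stdint.h>
#include <string.h>     // memcpy

// Hypothetical helpers: copy the bytes into a properly aligned local
// variable first, then convert from network byte order to host byte order.
static uint16_t load_net16(const char *p) {
    uint16_t v;
    memcpy(&v, p, sizeof v);  // byte-wise copy, safe for any alignment
    return ntohs(v);
}

static uint32_t load_net32(const char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return ntohl(v);
}
```

There is no standard 64-bit counterpart to ntohl, so the 8-byte extended length would still need to be assembled by hand, for example from two 32-bit halves.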
Upvotes: 2