Ynv

Reputation: 1974

Serializing strings in C

I'm serializing structs into byte streams. My method is simple: pack all ints in little-endian order and copy strings including the null terminator. The other side has to know statically how to unpack the byte stream; there is no additional metadata.

My problem is that I do not know how to handle a NULL pointer.

I need to send something, because there is no additional metadata in the stream.

I considered the following three options:

  1. Send a '\0' and make the receiving side interpret it as NULL in any case

  2. Send a '\0' and make the receiving side interpret it as '\0' in any case (alloc a byte)

  3. Send a special character representing char* str == NULL, e.g. ETX, EOT, EM ?

What do you think?

Upvotes: 2

Views: 2579

Answers (4)

Andrew Rasmussen

Reputation: 15109

It looks like you are currently trying to tell the receiving end that the end of the serialized string has been reached by passing it a special character. There are a million cases that can screw you over with this:

What if your struct contains a byte that is equal to that special character? You escape it with another special character. What if your struct contains a byte sequence equal to your escape character followed by your special character? Then you have to check for that too.

Yeah, it's doable, but I don't think it's a very good solution: you'll have to write a parser to look for the escape character, and anyone who takes a look at the code later will spend two hours trying to figure out what's going on.

(tl;dr) Instead... just make the first 32 bits of the serialized string equal to the number of bytes in the string. This only costs 4 bytes per serialization, solves all your problems, you won't have to write a parser or worry about special characters, and will make it a lot easier on the next guy who gets to read through your code!

edit

Thanks to JeremyP I've just realized that I didn't really answer your question. Send one of these guys for every string:

struct s_str { bool is_null; int size; char* str; };

If it's null, simply set is_null to true and you don't have to worry about the other two. If it's size zero, set is_null to false and size to zero. If str contains just a '\0', set is_null to false, size to one, and str[0] to '\0'.

In my opinion, this might not be the most memory efficient way (you could probably save a byte somewhere somehow) but is definitely quite clear in what you're doing, and again the next guy that comes along will like this a lot more.
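Here is a minimal sketch of how the flag-plus-length idea could look on the wire. The names put_str and get_str and the exact layout (1 flag byte, then a 4-byte little-endian length, then the bytes) are my own assumptions for illustration, not something fixed by the struct above. Unlike the description above, this sketch does not send the terminator, so an empty string simply has size zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wire format for one string:
 *   1 byte   is_null flag (1 = NULL pointer, 0 = string follows)
 *   4 bytes  length in little-endian order (only when is_null == 0)
 *   n bytes  string data without the terminator
 */

/* Append one string to out; returns the number of bytes written.
 * The caller must provide at least 5 + strlen(s) bytes. */
size_t put_str(uint8_t *out, const char *s)
{
    assert(out != NULL);
    if (s == NULL) {
        out[0] = 1;
        return 1;
    }
    uint32_t n = (uint32_t)strlen(s);
    out[0] = 0;
    out[1] = (uint8_t)(n & 0xff);
    out[2] = (uint8_t)((n >> 8) & 0xff);
    out[3] = (uint8_t)((n >> 16) & 0xff);
    out[4] = (uint8_t)((n >> 24) & 0xff);
    memcpy(out + 5, s, n);
    return 5 + (size_t)n;
}

/* Read one string from in; stores a malloc'ed copy (or NULL) in *s
 * and returns the number of bytes consumed. No bounds checking here:
 * the input is assumed to be complete and well formed. */
size_t get_str(const uint8_t *in, char **s)
{
    assert(in != NULL && s != NULL);
    if (in[0] != 0) {
        *s = NULL;
        return 1;
    }
    uint32_t n = (uint32_t)in[1] | ((uint32_t)in[2] << 8) |
                 ((uint32_t)in[3] << 16) | ((uint32_t)in[4] << 24);
    *s = malloc((size_t)n + 1);
    assert(*s != NULL); /* a real implementation would report the error */
    memcpy(*s, in + 5, n);
    (*s)[n] = '\0';
    return 5 + (size_t)n;
}
```

With this layout a NULL pointer costs 1 byte and an empty string costs 5, so the receiver can always tell the two apart.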

Upvotes: 5

Michael F

Reputation: 40869

I would advise you to use an existing library for serialization. I can think of two at the moment: tpl and gwlib's gwser.

About tpl:

You can use tpl to store and reload your C data quickly and easily. Tpl works with files, memory buffers and file descriptors so it's suitable for use as a file format, IPC message format or any scenario where you need to store and retrieve your data.

About gwlib, see the link, it's not very verbose, but it provides a few usage examples.

Upvotes: 2

Klas Lindbäck

Reputation: 33283

It depends on the significance of the pointer in your protocol. If the pointer is significant, i.e. it is needed for the recipient to know how to rebuild the struct, then you need to send something. It could be either a byte with 0/non-zero to indicate existence, or an integer that indicates the number of bytes pointed to by the pointer.

Example: struct Foo { int *arr; char *text; };

Struct Foo could be serialized like this:

<arr length><  arr   ><text length>< text >
  4 bytes    n bytes    4 bytes     n bytes
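A rough sketch of writing that layout, under a few assumptions of mine: lengths are byte counts in little-endian order, the text length includes the terminator (so 0 can stand for text == NULL and 1 for the empty string), and the element count is passed in separately because struct Foo as shown does not store it. put_foo and put_u32 are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit value in little-endian order. */
static void put_u32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)((v >> 8) & 0xff);
    out[2] = (uint8_t)((v >> 16) & 0xff);
    out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Serialize <arr length><arr><text length><text>.
 * Both lengths are byte counts. Assumption: the text length includes
 * the terminator, so 0 encodes text == NULL and 1 encodes "".
 * arr_count is a hypothetical extra parameter; struct Foo as posted
 * does not record how many ints arr points to.
 * Returns the number of bytes written to out. */
size_t put_foo(uint8_t *out, const int32_t *arr, uint32_t arr_count,
               const char *text)
{
    assert(out != NULL);
    assert(arr != NULL || arr_count == 0);
    size_t pos = 0;

    put_u32(out + pos, arr_count * 4u);
    pos += 4;
    for (uint32_t i = 0; i < arr_count; i++) {
        put_u32(out + pos, (uint32_t)arr[i]);
        pos += 4;
    }

    uint32_t text_len = text ? (uint32_t)strlen(text) + 1u : 0u;
    put_u32(out + pos, text_len);
    pos += 4;
    if (text_len > 0) {
        memcpy(out + pos, text, text_len);
        pos += text_len;
    }
    return pos;
}
```

Note that this sketch cannot tell arr == NULL from an empty array; if that difference matters to the recipient, a separate existence byte would be needed for arr.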

Upvotes: 2

Aftnix

Reputation: 4599

Do not do this. Use some extra bytes to store the length and concatenate them with your data string. The receiving end can check the length to know how much it should read into its local buffer.
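The receiving side of that could look roughly like this. It is only a sketch: read_field is a hypothetical helper, and the 4-byte little-endian length prefix is an assumption taken from the question's integer format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read one length-prefixed field: a 4-byte little-endian length
 * followed by that many bytes. Copies the payload into dst and
 * stores its size in *out_len.
 * Returns the number of bytes consumed from in, or 0 if the input is
 * truncated or the payload does not fit into dst. */
size_t read_field(const uint8_t *in, size_t in_len,
                  uint8_t *dst, size_t dst_cap, uint32_t *out_len)
{
    assert(in != NULL && dst != NULL && out_len != NULL);
    if (in_len < 4)
        return 0; /* not even a complete length prefix */

    uint32_t n = (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
                 ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);

    /* Check the announced length before touching the payload. */
    if (n > in_len - 4 || n > dst_cap)
        return 0;

    memcpy(dst, in + 4, n);
    *out_len = n;
    return 4 + (size_t)n;
}
```

A successful read always consumes at least the 4 prefix bytes, so a return value of 0 can only mean failure.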

Upvotes: 2
