pudiva
pudiva

Reputation: 274

Data serialization in C?

I have this structure which I want to write to a file:

typedef struct
{
    char* egg;
    unsigned long sausage;
    long bacon;
    double spam;
} order;

This file must be binary and must be readable by any machine that has a C99 compiler.

I looked at various approaches to this matter such as ASN.1, XDR, XML, ProtocolBuffers and many others, but none of them fit my requirements:

I decided then to make my own data protocol. I could handle the following representations of integer types:

in a valid, simple and clean way (impressive, no?). However, the real types are being a pain now.

How should I read float and double from a byte stream? The standard says that bitwise operators (at least &, |, << and >>) are for integer types only, which left me without hope. The only way I could think was:

int sign;
int exponent;
unsigned long mantissa;

order my_order;

sign = read_sign();
exponent = read_exponent();
mantissa = read_mantissa();

my_order.spam = sign * mantissa * pow(10, exponent);

but that doesn't seem really efficient. I also could not find a description of the representation of double and float. How should one proceed before this?

Upvotes: 2

Views: 1854

Answers (5)

Nils Pipenbrinck
Nils Pipenbrinck

Reputation: 86363

If you want to be as portable as possible with floats you can use frexp and ldexp:

void WriteFloat (float number)
{
  int exponent;
  unsigned long mantissa;

  mantissa = (unsigned int) (INT_MAX * frexp(number, &exponent);

  WriteInt (exponent);
  WriteUnsigned (mantissa);
}

float ReadFloat ()
{
  int exponent = ReadInt();
  unsigned long mantissa = ReadUnsigned();

  float value = (float)mantissa / INT_MAX;

  return ldexp (value, exponent);
}

The Idea behind this is, that ldexp, frexp and INT_MAX are standard C. Also the precision of an unsigned long is usually at least as high as the width of the mantissa (no guarantee, but it is a valid assumption and I don't know a single architecture that is different here).

Therefore the conversion works without precision loss. The division/multiplication with INT_MAX may loose a bit of precision during conversion, but that's a compromise one can live with.

Upvotes: 3

u0b34a0f6ae
u0b34a0f6ae

Reputation: 49813

This answer uses Nils Pipenbrinck's method but I have changed a few details that I think help to ensure real C99 portability. This solution lives in an imaginary context where encode_int64 and encode_int32 etc already exist.

#include <stdint.h>     
#include <math.h>                                                         

#define PORTABLE_INTLEAST64_MAX ((int_least64_t)9223372036854775807) /* 2^63-1*/             

/* NOTE: +-inf and nan not handled. quickest solution                            
 * is to encode 0 for !isfinite(val) */                                          
void encode_double(struct encoder *rec, double val) {                            
    int exp = 0;                                                                 
    double norm = frexp(val, &exp);                                              
    int_least64_t scale = norm*PORTABLE_INTLEAST64_MAX;                          
    encode_int64(rec, scale);                                                    
    encode_int32(rec, exp);                                                      
}                                                                                

void decode_double(struct encoder *rec, double *val) {                           
    int_least64_t scale = 0;                                                     
    int_least32_t exp = 0;                                                       
    decode_int64(rec, &scale);                                                   
    decode_int32(rec, &exp);                                                     
    *val = ldexp((double)scale/PORTABLE_INTLEAST64_MAX, exp);                    
}

This is still not a real solution, inf and nan can not be encoded. Also notice that both parts of the double carry sign bits.

int_least64_t is guaranteed by the standard (int64_t is not), and we use the least perimissible maximum for this type to scale the double. The encoding routines accept int_least64_t but will have to reject input that is larger than 64 bits for portability, the same for the 32 bit case.

Upvotes: 2

Jason
Jason

Reputation: 32510

If you are using IEEE-754 why not access the float or double as a unsigned short or unsigned long and save the floating point data as a series of bytes, then re-convert the "specialized" unsigned short or unsigned long back to a float or double on the other side of the transmission ... the bit-data would be preserved, so you should end-up with the same floating point number after transmission.

Upvotes: 2

Anomie
Anomie

Reputation: 94804

The C standard doesn't define a representation for floating point types. Your best bet would be to convert them to IEEE-754 format and store them that way. Portability of binary serialization of double/float type in C++ may help you there.

Note that the C standard also doesn't specify a format for integers. While most computers you're likely to encounter will use a normal two's-complement representation with only endianness to be concerned about, it's also possible they would use a one's-complement or sign-magnitude representation, and both signed and unsigned ints may contain padding bits that don't contribute to the value.

Upvotes: 1

lhf
lhf

Reputation: 72312

If you are using C99 you can output real numbers in portable hex using %a.

Upvotes: 2

Related Questions