PaeneInsula

Reputation: 2100

How to convert a very long string to a double in portable C

I want to convert a very long string of numbers to a double in a portable way in C. In my case, portable means that it would work in Linux and Windows. My ultimate goal is to be able to pack a string of numbers into an 8-byte double and fwrite/fread to/from a binary file. The number is always unsigned.

I am using this string to pack a 4 digit year, 2 digit month, 2 digit day, 4 digit HH:MM, 1 digit variable, and a 10 digit value. So I am trying to pack 23 bytes into 8 bytes.

I have tried all of the standard things:

char myNumAsString[] = "1234567890123456789";

char *ptr;
char dNumString[64];
double dNum;


dNum = atol(myNumAsString);
sprintf(dNumString, "%lf", dNum);

dNum = atof(myNumAsString);
sprintf(dNumString, "%lf", dNum);

dNum = strtod(myNumAsString, &ptr);
sprintf(dNumString, "%lf", dNum);

sscanf(myNumAsString, "%lf", &dNum);
sprintf(dNumString, "%lf", dNum);

None of these work; they all round off the last several digits. Is there a portable way to do this?

Upvotes: 1

Views: 343

Answers (2)

Serge Ballesta

Reputation: 148975

You can save some bits once you take into account that the fields cannot hold arbitrary values.

  • HH:MM : 0<=HH<=23 <32 : 5 bits, 0 <= MM <= 59 <64 : 6 bits
  • DD : 1 <= DD <= 31 < 32 : 5 bits
  • mm (month) : 1 <= mm <= 12 < 16 : 4 bits

So instead of 8 bytes you only need 20 bits, which is less than 3 bytes.

  • YYYY : do you really need to accept any year between 0 and 9999 ??? If you could limit the interesting part to just 2 centuries, 8 bits would be enough.

So a full date could fit in as little as 4 bytes instead of 12.

But if you want to add to that a 10 digit number + 1 variable, they would not fit in the 4 remaining bytes, because the greatest uint32_t is 4294967295: enough for any 9 digit number, but only for about half of the 10 digit numbers.

If 32 years were enough, the 3 bits saved on the year would let you represent values up to 34359738367 (2^35 - 1), which is enough for the 10 digits and a variable taking the values 0, 1 or 2.

Let's look at that more precisely, using the full 8 bytes; the transformations would be:

#include <stdint.h>

static const unsigned orig_year = 1970;  // base year: an assumed value, adjust as needed

uint8_t minute(uint64_t timestamp) { return timestamp & 0x3f; }          // bits 0-5
uint8_t hour(uint64_t timestamp) { return (timestamp >> 6) & 0x1f; }     // bits 6-10
uint8_t day(uint64_t timestamp) { return (timestamp >> 11) & 0x1f; }     // bits 11-15
uint8_t month(uint64_t timestamp) { return (timestamp >> 16) & 0x0f; }   // bits 16-19
unsigned year(uint64_t timestamp) { return orig_year + ((timestamp >> 20) & 0x3f); } // max 64 years
uint64_t ten_digits(uint64_t timestamp) { return (timestamp >> 26) & 0x7FFFFFFFF; }  // 35 bits
uint8_t var(uint64_t timestamp) { return (timestamp >> 61) & 0x7; } // 8 values for the one digit variable

If you can accept only 4 values for the one digit variable, end part becomes:

unsigned year(uint64_t timestamp) { return orig_year + ((timestamp >> 20) & 0x7f); } // max 128 years
uint64_t ten_digits(uint64_t timestamp) { return (timestamp >> 27) & 0x7FFFFFFFF; }
uint8_t var(uint64_t timestamp) { return (timestamp >> 62) & 0x3; } // 4 values for the one digit variable

You could even save some more bits if you computed an absolute number of minutes since an epoch, but the computations would be much more complex.
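
The accessors only show how to read fields back out. A matching packing routine for the first layout (3-bit variable, 6-bit year offset) might look like the sketch below. The name pack_fields and the ORIG_YEAR value of 1970 are assumptions made for illustration; the answer does not fix a base year.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical base year; the answer leaves orig_year unspecified. */
#define ORIG_YEAR 1970u

/* Build the 64-bit value for the first layout:
   bits 0-5 minute, 6-10 hour, 11-15 day, 16-19 month,
   20-25 year offset, 26-60 ten-digit value, 61-63 variable. */
uint64_t pack_fields(unsigned year, unsigned month, unsigned day,
                     unsigned hour, unsigned minute,
                     unsigned var, uint64_t ten_digits)
{
    /* Reject values that would spill into a neighbouring field. */
    assert(year >= ORIG_YEAR && year - ORIG_YEAR < 64);
    assert(month >= 1 && month <= 12);
    assert(day >= 1 && day <= 31);
    assert(hour <= 23 && minute <= 59);
    assert(var <= 7);
    assert(ten_digits <= 9999999999ULL);

    return (uint64_t)minute
         | ((uint64_t)hour << 6)
         | ((uint64_t)day << 11)
         | ((uint64_t)month << 16)
         | ((uint64_t)(year - ORIG_YEAR) << 20)
         | (ten_digits << 26)
         | ((uint64_t)var << 61);
}
```

Note that a 3-bit variable holds only 0 to 7, so a full decimal digit (0 to 9) would need a fourth bit taken from somewhere else, for example from the year range.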

Upvotes: 0

chux

Reputation: 153592

Take advantage of the fact that part of the string is a timestamp and not an arbitrary set of digits.

With 60 minutes, 24 hours, 365.25 days/year, y years, a digit and 10 digits, there are 60*24*365.25*y*10*pow(10,10) combinations, or about 5.3e16 * y.

An 8-byte, 64-bit number has 1.8e19 combinations. So if the range of years is 350 or less (like 1970 to 2320), things will fit.
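
That estimate can be checked with integer arithmetic. The sketch below uses the same 365.25 days/year approximation; the macro and function names are made up for illustration.

```c
#include <stdint.h>

/* Minutes in an average year: 60 * 24 * 365.25 = 525960. */
#define MINUTES_PER_YEAR 525960ULL
/* One digit (10 values) times a 10-digit value (10^10 values). */
#define PAYLOAD_COMBINATIONS 100000000000ULL

/* Distinct packed values needed per year of one-minute timestamps. */
uint64_t combinations_per_year(void)
{
    return MINUTES_PER_YEAR * PAYLOAD_COMBINATIONS;
}

/* Largest whole number of years whose combinations fit in 64 bits. */
uint64_t max_years_in_64_bits(void)
{
    return UINT64_MAX / combinations_per_year();
}
```

Under these assumptions max_years_in_64_bits() comes out at 350, which matches the range quoted above.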

Assuming a Unix timestamp, and that the OP can convert a time string to time_t (check out mktime()):

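#include <stdint.h>
#include <time.h>

time_t epoch = 0;  // Jan 1, 1970, Adjust as needed.

// Assumes time_t counts seconds and t is not before epoch.
uint64_t pack(time_t t, int digit1, unsigned long long digit10) {
  uint64_t packed = digit1 * 10000000000ull + digit10;  // low 11 decimal digits
  uint64_t tminutes = (uint64_t)((t - epoch)/60);       // whole minutes since epoch

  packed += tminutes*100000000000ull;
  return packed;
}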

Reverse to unpack.
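
A minimal sketch of that reverse step, assuming the same decimal layout (minutes since the epoch, then one digit, then ten digits) and a time_t that counts seconds. The struct unpacked, the function unpack and the constant unpack_epoch are hypothetical names, not part of the code above.

```c
#include <stdint.h>
#include <time.h>

/* Assumed to equal the epoch used when packing (0 is Jan 1, 1970 on Unix). */
static const time_t unpack_epoch = 0;

/* Hypothetical container for the three recovered fields. */
struct unpacked {
    time_t t;                   /* timestamp, truncated to whole minutes */
    int digit1;                 /* the one-digit variable */
    unsigned long long digit10; /* the ten-digit value */
};

struct unpacked unpack(uint64_t packed)
{
    struct unpacked u;

    u.digit10 = packed % 10000000000ULL;  /* low ten decimal digits */
    packed /= 10000000000ULL;
    u.digit1 = (int)(packed % 10);        /* next decimal digit */
    packed /= 10;
    u.t = unpack_epoch + (time_t)packed * 60;  /* remaining value is minutes */
    return u;
}
```

Any seconds in the original time_t are lost, since packing divides by 60.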


Or a more complete portable packing (code untested)

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
// pack 23 digit string
// "YYYYMMDDHHmm11234567890"
// returns (uint64_t)-1 on malformed input
uint64_t pack(const char *s) {
  struct tm tm0 = {0};
  tm0.tm_year = 1970 - 1900;
  tm0.tm_mon = 1-1;
  tm0.tm_mday = 1;
  tm0.tm_isdst = -1;
  time_t t0 = mktime(&tm0);  // t0 is the local-time epoch; 0 on a Unix system set to UTC
  struct tm tm = {0};
  char sentinel;             // only assigned if unexpected trailing input exists
  int digit1;
  unsigned long long digit10;
  if (strlen(s) != 4+2+2+2+2+1+10) return -1;
  if (7 != sscanf(s, "%4d%2d%2d%2d%2d%1d%10llu%c", &tm.tm_year,
          &tm.tm_mon, &tm.tm_mday, &tm.tm_hour, &tm.tm_min,
          &digit1, &digit10, &sentinel)) return -1;
  tm.tm_year -= 1900;
  tm.tm_mon--;
  tm.tm_isdst = -1;
  time_t t = mktime(&tm);

  // difftime() avoids assuming anything about how time_t is encoded
  double diff_sec = difftime(t, t0);
  unsigned long long diff_min = diff_sec/60;
  return diff_min * 100000000000ull + digit1*10000000000ull + digit10;
}
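
Since the goal is to fwrite/fread the packed value, here is one possible way to store it with a fixed byte order, so the file does not depend on the endianness of the machine that wrote it. This is only a sketch: write_u64 and read_u64 are hypothetical helpers, and the choice of least-significant-byte-first is arbitrary.

```c
#include <stdint.h>
#include <stdio.h>

/* Write v as 8 bytes, least significant byte first.
   Returns 1 on success, 0 on failure. */
int write_u64(FILE *f, uint64_t v)
{
    unsigned char buf[8];
    for (int i = 0; i < 8; i++)
        buf[i] = (unsigned char)(v >> (8 * i));
    return fwrite(buf, 1, sizeof buf, f) == sizeof buf;
}

/* Read 8 bytes written by write_u64 into *out.
   Returns 1 on success, 0 on a short read or error. */
int read_u64(FILE *f, uint64_t *out)
{
    unsigned char buf[8];
    if (fread(buf, 1, sizeof buf, f) != sizeof buf)
        return 0;
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)buf[i] << (8 * i);
    *out = v;
    return 1;
}
```

The file should be opened in binary mode ("wb" / "rb"), which matters on Windows, where text mode translates line endings.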

Upvotes: 2
