Zciurus

Reputation: 838

How can I convert this number representation to a float?

I read this 16-bit value from a temperature sensor (type MCP9808)

Ignoring the first three MSBs, what's an easy way to convert the other bits to a float? I managed to convert the values 2^7 through 2^0 to an integer with some bit-shifting:

uint16_t rawBits = readSensor();
int16_t value = (rawBits << 3) / 128;

However I can't think of an easy way to also include the bits with an exponent smaller than 0, except for manually checking if they're set and then adding 1/2, 1/4, 1/8 and 1/16 to the result respectively.
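A side note on the snippet above, offered as a hedged sketch rather than a fix: on platforms where int is wider than 16 bits, rawBits is promoted to int before the shift, so `rawBits << 3` does not discard the three top bits there. Masking explicitly avoids depending on the width of int. The function name `whole_degrees` is made up for illustration, and the bit layout (three flag bits in 15..13, sign in bit 12, 2^7..2^0 in bits 11..4) is assumed from the question and the answers.

```c
#include <stdint.h>

/* Hypothetical helper: whole degrees only (bits 2^7 .. 2^0),
   ignoring the three flag bits, the sign bit and the fraction. */
static uint8_t whole_degrees(uint16_t rawBits) {
    return (uint8_t)((rawBits >> 4) & 0xFFu);
}
```

For a raw value of 0x0194 this yields 25, whether or not the flag bits are set.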

Upvotes: 2

Views: 664

Answers (4)

Zilog80

Reputation: 2562

If your C compiler has a clz builtin or equivalent, it can be used to avoid multiplication. In your case, since the temperature value already looks like a mantissa, and if your C compiler uses the IEEE-754 float representation, translating the value directly into its IEEE-754 equivalent may be one of the most efficient ways:

Update: Compacted the code a little and clarified the explanation of the mantissa.

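#include <stdint.h>
#include <string.h>

float convert(uint16_t val) {
  uint16_t mantissa = (uint16_t)(val <<4);
  if (mantissa==0) return 0.0f;
  unsigned char e =  (unsigned char)(__builtin_clz(mantissa) - 16);
  /* Shift as uint32_t so the sign bit is never shifted into a signed int. */
  uint32_t r = (uint32_t)((uint32_t)(val & 0x1000) << 19 | (uint32_t)(0x86 - e) << 23 | (((uint32_t)mantissa << (e+8)) & 0x07FFFFF));
  float f;
  memcpy(&f, &r, sizeof f);  /* copies the bits without breaking strict aliasing */
  return f;
}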

or

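float convert(unsigned char msb, unsigned char lsb) {
  uint16_t mantissa = (uint16_t)((msb<<8 | lsb) <<4);
  if (mantissa==0) return 0.0f;
  unsigned char e =  (unsigned char)(__builtin_clz(mantissa) - 16);
  /* Shift as uint32_t so the sign bit is never shifted into a signed int. */
  uint32_t r = (uint32_t)((uint32_t)(msb & 0x10) << 27 | (uint32_t)(0x86 - e) << 23 | (((uint32_t)mantissa << (e+8)) & 0x07FFFFF));
  float f;
  memcpy(&f, &r, sizeof f);  /* from <string.h>; copies the bits without breaking strict aliasing */
  return f;
}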

Explanation:

We use the fact that the temperature value is already a kind of mantissa, with a magnitude below 256. Its binary exponent is therefore at most 7 (2^7 = 128) and at least -4 (2^-4 = 1/16). We use the clz builtin to get the "order" of the first set bit in the mantissa; the exponent is then the theoretical maximum of 7 minus this "order". We also use the order to left-shift the temperature value into the IEEE-754 mantissa position, with one extra left shift to drop the implied leading '1' of the IEEE-754 significand. We thus build a 32-bit binary IEEE-754 representation from the temperature value as follows:

  1. First, the sign bit goes to the 32nd bit of our binary IEEE-754 representation.
  2. The computed exponent is the theoretical maximum 7 (2^7 = 128) plus the IEEE-754 bias (127) minus the actual "order" of the temperature value. The "order" is deduced from the number of leading '0' bits of its 12-bit representation in the variable mantissa, via the clz builtin. Beware that we assume here that the clz builtin expects a 32-bit value as its parameter, which is why we subtract 16. This code may require adaptation if your clz expects anything else. The number of leading '0' bits can range from 0 (a magnitude of 128 or more) to 11, since we return 0.0 directly for a zero value. Because the bit following the leading zeros is a 1, each leading zero amounts to one power of 2 less than the theoretical maximum of 7. Thus, with 7 + 127 = 134 = 0x86, we simply subtract the "order" to obtain the biased IEEE-754 exponent. If the "order" is greater than 7, we get the negative exponent required for values less than 1. We then place this 8-bit exponent in the 24th to 31st bits of our binary IEEE-754 representation.
  3. The temperature value is already a kind of mantissa. We remove the leading '0' bits and the first set bit by shifting left by (e + 1), and shift left by a further 7 bits to place the mantissa within the 32 bits (e + 7 + 1 = e + 8). We then keep only the desired 23 bits (AND with 0x7FFFFF). The first set bit must be removed because it is the implied '1' of the IEEE-754 significand (the power of 2 given by the exponent). We then have the IEEE-754 mantissa, placed in the 8th to 23rd bits of our binary IEEE-754 representation. The 4 trailing zeros of our 16-bit value and the seven zeros added on the right by the shift do not change the effective IEEE-754 value. As we start from a 32-bit value and combine the sign, exponent and mantissa with the OR operator (|), we obtain the final IEEE-754 representation.
  4. We can then return this binary representation as an IEEE-754 C float value.
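The four steps can be traced with a small portable sketch. This is not the code above but a variant under stated assumptions: the helper `leading_zeros16` is a made-up stand-in for the clz builtin, the names `sign`, `exponent` and `fraction` are illustrative, float is assumed to be IEEE-754 binary32, and bit 12 is treated as a sign flag over a 12-bit magnitude.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the clz builtin: leading zeros of a
   non-zero 16-bit value, counted with a plain loop. */
static unsigned leading_zeros16(uint16_t x) {
    unsigned n = 0;
    while ((x & 0x8000u) == 0) {
        x = (uint16_t)(x << 1);
        n++;
    }
    return n;
}

/* Assumes IEEE-754 binary32 floats and a sign flag in bit 12. */
static float convert_bits(uint16_t val) {
    uint16_t mantissa = (uint16_t)(val << 4);      /* drop flags and sign */
    if (mantissa == 0) return 0.0f;
    unsigned e = leading_zeros16(mantissa);        /* the "order", 0 .. 11 */
    uint32_t sign = (uint32_t)(val & 0x1000u) << 19;   /* step 1: bit 12 -> bit 31 */
    uint32_t exponent = (uint32_t)(0x86u - e) << 23;   /* step 2: 7 + 127 - e */
    uint32_t fraction = ((uint32_t)mantissa << (e + 8)) & 0x007FFFFFu; /* step 3 */
    uint32_t r = sign | exponent | fraction;
    float f;
    memcpy(&f, &r, sizeof f);                      /* step 4: reinterpret the bits */
    return f;
}
```

As a worked case, 0x0194 gives mantissa 0x1940 with 3 leading zeros, so the biased exponent is 0x86 - 3 = 131 (2^4) and the stored fraction is 0x4A0000, i.e. 1.578125 * 16 = 25.25.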

Because of the required clz and the IEEE-754 translation, this approach is less portable. Its main benefit is avoiding MUL operations in the resulting machine code, for performance on architectures with a "poor" FPU.

P.S.: Explanation of the casts. I've added explicit casts to let the C compiler know that we deliberately discard some bits:

  • uint16_t mantissa = (uint16_t)(val <<4); : The cast here tells the compiler that we know we'll lose the four leftmost bits, as that is the goal here. We discard the first four bits of the temperature value for the mantissa.
  • (unsigned char)(__builtin_clz(mantissa) - 16) : We tell the C compiler that we only consider an 8-bit range for the builtin's return value, as we know our mantissa has only 12 significant bits and thus an output range from 0 to 11. We therefore do not need the full int return value.
  • uint32_t r = (uint32_t) ... : We tell the C compiler not to bother with the sign representation here, as we are building an IEEE-754 representation.

Upvotes: 0

Alexander

Reputation: 63271

If performance isn't a super big deal, I would go for something less clever and more explicit, along the lines of:

bool is_bit_set(uint16_t value, uint16_t bit) {
    uint16_t mask = 1 << bit;
    return (value & mask) == mask;
}

float parse_temperature(uint16_t raw_reading) {
    if (is_bit_set(raw_reading, 15)) { /* temp is above Tcrit. Do something about it. */ }
    if (is_bit_set(raw_reading, 14)) { /* temp is above Tupper. Do something about it. */ }
    if (is_bit_set(raw_reading, 13)) { /* temp is below Tlower. Do something about it. */ }


    uint16_t whole_degrees = (raw_reading & 0x0FF0) >> 4;

    float magnitude = (float) whole_degrees;
    if (is_bit_set(raw_reading, 0)) magnitude += 1.0f/16.0f;
    if (is_bit_set(raw_reading, 1)) magnitude += 1.0f/8.0f;
    if (is_bit_set(raw_reading, 2)) magnitude += 1.0f/4.0f;
    if (is_bit_set(raw_reading, 3)) magnitude += 1.0f/2.0f;
    
    bool is_negative = is_bit_set(raw_reading, 12);
    
    // TODO: What do the 3 most significant bits do?

    return magnitude * (is_negative ? -1.0f : 1.0f);
}

Honestly this is a lot of simple constant math, I'd be surprised if the compiler wasn't able to heavily optimize it. That would need confirmation, of course.
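This does not confirm what any particular compiler does, but as a minimal sketch the four fractional checks can at least be shown to equal a single divide of the low nibble by 16. The function names below are made up for illustration.

```c
#include <stdint.h>

/* Bit-by-bit sum of the four fractional bits, as in the answer above. */
static float fraction_by_bits(uint16_t raw) {
    float f = 0.0f;
    if (raw & 0x0001u) f += 1.0f / 16.0f;
    if (raw & 0x0002u) f += 1.0f / 8.0f;
    if (raw & 0x0004u) f += 1.0f / 4.0f;
    if (raw & 0x0008u) f += 1.0f / 2.0f;
    return f;
}

/* The same four bits read as one integer and scaled once. */
static float fraction_by_division(uint16_t raw) {
    return (float)(raw & 0x000Fu) / 16.0f;
}

/* Hypothetical check: counts how many of the 16 bit patterns disagree. */
static int count_fraction_mismatches(void) {
    int mismatches = 0;
    for (uint16_t raw = 0; raw < 16; raw++) {
        if (fraction_by_bits(raw) != fraction_by_division(raw)) mismatches++;
    }
    return mismatches;
}
```

All sixteen values are exact in binary floating point, so comparing with `!=` is safe here.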

Upvotes: 0

0___________

Reputation: 67546

float convert(unsigned char msb, unsigned char lsb)
{
    return ((lsb | ((msb & 0x0f) << 8)) * ((msb & 0x10) ? -1 : 1)) / 16.0f;
}

or

float convert(uint16_t val)
{
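    /* Truncate to 16 bits after the shift: integer promotion would otherwise keep the flag and sign bits. */
    return (((val & 0x1000) ? -1 : 1) * (int32_t)(uint16_t)(val << 4)) / 256.0f;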
}

Upvotes: 2

Bill Lynch

Reputation: 81936

Something like this seems pretty reasonable. Take the number portion, divide by 16, and fix the sign.

float tempSensor(uint16_t value) {
  bool negative = (value & 0x1000);
  return (negative ? -1 : 1) * (value & 0x0FFF) / 16.0f;
}
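One caveat worth checking against the datasheet: the answers in this thread treat bit 12 as a sign flag over a 12-bit magnitude. I believe the MCP9808 documents the reading as two's complement, but treat that as an assumption to verify. The sketch below shows both readings side by side; the function names are made up for illustration. The two agree for positive temperatures and differ only for negative ones.

```c
#include <stdint.h>

/* As in the answer above: bit 12 is a sign flag over a 12-bit
   magnitude in units of 1/16 degree C. */
static float temp_sign_magnitude(uint16_t value) {
    float magnitude = (float)(value & 0x0FFFu) / 16.0f;
    return (value & 0x1000u) ? -magnitude : magnitude;
}

/* Alternative, ASSUMING bits 12..0 form a 13-bit two's complement number. */
static float temp_twos_complement(uint16_t value) {
    int32_t raw = (int32_t)(value & 0x1FFFu);   /* drop the three flag bits */
    if (raw & 0x1000) raw -= 0x2000;            /* sign-extend from bit 12 */
    return (float)raw / 16.0f;
}
```

For example, 0x0194 decodes to 25.25 either way, while -25.25 would be encoded as 0x1194 under the first reading and as 0x1E6C under the second.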

Upvotes: 4
