Node.JS

Reputation: 1578

Addition of 16-bit floating-point numbers and how to convert the result back to decimal

I want to convert the following two numbers into a modified 16-bit version of the IEEE Floating Point Standard (FPS), in which the 23-bit fractional part is shortened to a 7-bit fractional part, and then add them. I don't know whether I have done it correctly, or how to convert the result back to decimal to get approximately 28.625.

Numbers in parentheses are hidden bits. (For example, we write 3.5 as 11.1*2^0 and then normalize it to 1.11*2^1; the leftmost '1' is omitted, and we call it the hidden bit.)


Upvotes: 1

Views: 3187

Answers (1)

Eric Postpischil

Reputation: 222900

Numbers are not added by adding their significands while their exponents are different. That is like trying to add 25.25 to 3.375 by adding 2525 to 3375. It does not work. You must align the corresponding bits by shifting them and adjusting the exponents accordingly. If you had these two numbers:

  • 1.1001010₂ • 2⁴
  • 1.1011000₂ • 2¹

Then you would adjust the smaller number so that its exponent matches the larger one, giving this pair:

  • 1.1001010₂ • 2⁴ (same as original)
  • 0.0011011₂ • 2⁴ (shifted three bits right and added three to exponent)

Then you add them:

  • 1.1100101₂ • 2⁴
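
The alignment and addition can be sketched in Python as follows. This is only an illustration, not a full implementation: the function name `align_and_add` is hypothetical, each significand is assumed to be stored as an integer containing the hidden bit plus 7 fraction bits, both operands are assumed positive, and bits shifted out during alignment are truncated rather than rounded.

```python
def align_and_add(sig_a, exp_a, sig_b, exp_b, frac_bits=7):
    """Add two positive binary floating-point values.

    Each significand is an integer holding the hidden bit plus frac_bits
    fraction bits, so 1.1001010 (binary) is written 0b11001010.
    Returns (significand, exponent) in the same representation.
    """
    # Make (sig_a, exp_a) the operand with the larger exponent.
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a

    # Shift the smaller operand right so that both share exponent exp_a.
    # Bits shifted out are dropped here (truncation, not rounding).
    sig_b >>= exp_a - exp_b

    total = sig_a + sig_b

    # A carry past the hidden-bit position means the sum is >= 2.0,
    # so shift right once and bump the exponent to renormalize.
    if total >> (frac_bits + 1):
        total >>= 1
        exp_a += 1

    return total, exp_a


# The two numbers from the answer: 1.1001010b * 2^4 and 1.1011000b * 2^1.
sig, exp = align_and_add(0b11001010, 4, 0b11011000, 1)
print(format(sig, "08b"), exp)
```

For these inputs the three bits shifted out of the smaller operand are all zero, so no precision is lost in the alignment step.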

Then you can round that number if necessary and convert it to another format.
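
To get back to decimal, multiply the significand (hidden bit included) by two raised to the exponent. The sketch below does that, and also decodes a packed 16-bit word. The layout used for the packed word (1 sign bit, 8 exponent bits with bias 127, 7 fraction bits) is an assumption based on the question's description of shortening only the fraction; the function names are hypothetical, and only normal numbers are handled.

```python
FRAC_BITS = 7    # fraction bits kept by the modified format
EXP_BIAS = 127   # assumed: same 8-bit biased exponent as IEEE single precision


def significand_to_decimal(significand, exponent, frac_bits=FRAC_BITS):
    """Value of an integer significand (hidden bit included) times 2**exponent."""
    # The integer significand is the true significand scaled by 2**frac_bits.
    return significand * 2.0 ** (exponent - frac_bits)


def decode_word(word):
    """Decode a normal 16-bit word (not zero, subnormal, infinity or NaN)."""
    sign = -1.0 if word >> 15 else 1.0
    biased_exp = (word >> FRAC_BITS) & 0xFF
    fraction = word & ((1 << FRAC_BITS) - 1)
    significand = (1 << FRAC_BITS) | fraction  # restore the hidden bit
    return sign * significand_to_decimal(significand, biased_exp - EXP_BIAS)


# 1.1100101b * 2^4, the sum computed in the answer.
print(significand_to_decimal(0b11100101, 4))

# The same value packed as sign 0, exponent 4 + 127 = 131, fraction 1100101.
print(decode_word(0b0_10000011_1100101))
```

Both calls give 28.625, the value the question expects, since 1.1100101₂ is 229/128 and 229/128 × 16 = 28.625.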

Upvotes: 3
