Reputation: 79218
Wondering how a low-level implementation of parseFloat
such as how it works in JavaScript would be implemented.
All the examples I've seen of typecasting resort to using it at some point, such as this, this, or this. On the other hand, there is this file which is quite large (from here).
Wondering if it is just a very complicated function or there is a straightforward implementation. Wondering just generally how it works if it is too complicated.
Perhaps this is closer to it.
Upvotes: 0
Views: 1021
Reputation: 222362
The essential mathematics of parseFloat
is very simple, requiring no more than elementary-school arithmetic. If we have a decimal numeral, we can easily convert it to binary by:
To determine a floating-point number, we need as many bits as fit in the significand (starting with the leading 1 bit from the binary numeral), and, for rounding purposes, we need to know the next bit and whether any bits after that are non-zero. (The information about the extra bits tells us whether the difference between the source value and the bits that fit in the significand is zero, not zero but less than 1/2 of the lowest bit that fits, exactly 1/2 of the lowest bit, or more than 1/2 of the lowest bit. This information is enough to decide whether to round up or down in any of the usual rounding modes.)
The information above tells you when to stop multiplying in the second part of the algorithm. As soon as you have all the significand bits, plus one more, plus you have either one non-zero bit or the sub-integer part is zero, you have all the information you need and can stop.
Then you construct a floating-point value by rounding the bits according to whatever rounding rule you are using (often round-to-nearest-ties-to-even), putting the bits into the significand of a floating-point object, and setting the exponent to record the position of the leading bit of the binary numeral.
There are some embellishments for checking for overflow or underflow or handling subnormal values. However, the basic arithmetic is simply elementary-school arithmetic.
Problems arise because the above uses arbitrary-size arrays and because it does not support scientific notation where an “e” is used to introduce a decimal exponent, as in “2.79e34”. The above algorithm requires that we maintain all the space needed to multiply and divide decimal numerals of any length given to us. Usually, we do not want to do that, and we also want faster algorithms. Note that supporting scientific notation with the above algorithm would also require arbitrary-size arrays. To fill out the decimal numeral for “2.79e34”, we have to fill an array with “27900000000000000000000000000000000”.
So algorithms are developed to do the conversion in smarter ways. Instead of doing exact calculations, we may do precise calculations but carefully analyze the errors produced to ensure they are too small to prevent us from getting the right answer. Also, data may be prepared in advance, such as tables with information about powers of ten, so that we have approximate values of powers of ten already in binary without having to compute them each time a conversion is performed.
The complications of converting decimal to binary floating-point arise out of this desire for algorithms that are fast and use limited resources. Allowing some errors causes a need for mathematical proofs to ensure the computations are correct, and trying to make the routines fast and resource-efficient lead people to think of clever techniques to use, which become tricky and require proof.
Upvotes: 2