RAs

Reputation: 387

Sum of two 'np.longdouble' yielding numerical error unrelated to printing

The issue I am trying to understand and solve is related to one asked years ago in the question: Sum of two "np.longdouble"s yields big numerical error. Unlike that question, however, mine has nothing to do with how the values are printed.

In Python's numpy library, suppose we create two long doubles like the following:

import numpy as np

a = np.longdouble('4')
b = np.longdouble('1e-3000')

As expected, if one inspects type(a) or type(b), the result is that both are of type numpy.float128.

What I want to do is add the values stored in a and b, but the sum comes out as exactly 4:

In [3]: a + b
Out[3]: 4.0

In [4]: (a + b) == np.longdouble(4)
Out[4]: True

In [5]: (a + b) == np.longdouble('4')
Out[5]: True

In [6]: (a + b) == np.longdouble('4.0')
Out[6]: True

In [7]: (a + b) > np.longdouble('4.0')
Out[7]: False

In [8]: np.equal(a + b,np.longdouble('4.0'))
Out[8]: True

In [9]: np.greater(a + b,np.longdouble('4.0'))
Out[9]: False

In [10]: type(a + b)
Out[10]: numpy.float128

As I think the tests above imply, the sum of a and b is being collapsed to exactly 4, even though the result is still stored in a float128 object.

Notice that the same does not happen with multiplication or division:

In [11]: a * b
Out[11]: 4e-3000

In [12]: a / b
Out[12]: 4e+3000

In [13]: b / a
Out[13]: 2.5e-3001

Although the same happens with subtraction:

In [14]: np.equal(a - b,np.longdouble('4.0'))
Out[14]: True

In [15]: np.equal(b - a,np.longdouble('-4.0'))
Out[15]: True

In [16]: (a - b) == np.longdouble('4.0')
Out[16]: True

In [17]: (b - a) == np.longdouble('-4.0')
Out[17]: True

Hence, my questions: why are summations and subtractions not working as intended and how could I have a + b in the above examples result in the number:

4.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001

That is, a number for which an equality comparison against 4 returns False.

Note, in case it makes any difference:

In [18]: np.nextafter(np.longdouble(0),1)
Out[18]: 4e-4951

Upvotes: 1

Views: 150

Answers (2)

kvantour

Reputation: 26481

Imagine you meet a person who can remember any number up to 5 digits long, together with any exponent. So you ask the person to compute the following sum:

4.0 + 1.0E-4

So that person takes out a pen and paper and starts doing the work:

  4.0000
  0.0001
+ ------
  4.0001

And the person tells you that the answer is 4.0001. You are impressed, as the answer is correct. You want to test the person further and ask for the result of:

4.0 + 1.0E-5

And the person tells you that it is 4. Now you are completely stupefied, as you know that if you add something nonzero to something else, the result must be bigger or smaller than what you started with. But then you remember that the person can only remember numbers up to 5 digits long, and you notice:

  4.00000
  0.00001
+ -------
  4.00001

But 4.00001 has 6 digits and the person can only remember 5. That is why the person returns 4.

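Now imagine that person is your computer, which can only remember binary numbers with a fixed number of digits (64 for the 80-bit extended format that np.longdouble maps to on x86, which is what the 4e-4951 in the question indicates; 113 for true IEEE quadruple precision) and exponents between −16382 and 16383, as well as the sign of the number.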
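
The five-digit person can be imitated directly. What follows is only a sketch: it swaps in Python's standard decimal module with a 5-digit context in place of a binary long double, and the names five_digits, kept and lost are made up for illustration.

```python
from decimal import Context, Decimal

# A decimal "person" who can keep only 5 significant digits.
five_digits = Context(prec=5)

four = Decimal("4.0")

# 4.0001 has exactly 5 significant digits, so nothing is lost.
kept = five_digits.add(four, Decimal("1.0E-4"))

# 4.00001 would need 6 significant digits, so it is rounded back to 4.
lost = five_digits.add(four, Decimal("1.0E-5"))

print(kept)          # 4.0001
print(lost)          # 4.0000
print(lost == four)  # True
```

The same rounding happens in binary for np.longdouble, only with a much longer (but still fixed) significand.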

Upvotes: 3

aka.nice

Reputation: 9382

10^3 is approximately 2^10, so do you realize that 1e-3000 is about 2^-10,000?

So if you add that tiny quantity to 4 (2^2) and don't want the difference to vanish, do you realize that this requires roughly ten thousand bits of precision for the significand?

Do you think that a 128-bit floating-point number can hold that many bits?

Tiny numbers are representable because the absolute precision of a floating-point number is floating: internally, the value is stored with a scaling, as (-1)^sign_bit * 2^exponent * significand. But that does not mean that the relative precision is that high!
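
As a rough illustration of that scaling, the sketch below uses ordinary Python floats (IEEE doubles) instead of np.longdouble, purely as an assumption so that it runs with the standard library alone. math.frexp splits a value into a significand and a power-of-two exponent; the variable names are illustrative.

```python
import math
import sys

# frexp returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1 for nonzero x.
m_big, e_big = math.frexp(4.0)
m_tiny, e_tiny = math.frexp(1e-300)

print(m_big, e_big)  # 0.5 3
print(e_tiny)        # a large negative exponent: the scaling does the work

# The significand width is the same for both values.
print(sys.float_info.mant_dig)

# The tiny value is representable on its own, but it lies far below the
# last significand bit of 4.0, so the sum rounds back to 4.0.
print(4.0 + 1e-300 == 4.0)  # True
```

The exponent moves freely between the two values, while the number of significand bits available to each stays the same.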

The number of bits used for representing the exponent and the significand is fixed for double, longdouble, etc.

The number of bits reserved for the exponent fixes the range of representable values.
The number of bits used for the significand fixes the relative precision.
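
The bit count above can be estimated with a few lines of arithmetic. This is a sketch under stated assumptions: the table of significand widths is hard-coded for three common binary formats, and the last part swaps in the standard decimal module (arbitrary-precision decimal, not a NumPy type) as one possible way to keep the 1e-3000 term.

```python
import math
from decimal import Context, Decimal

# Binary exponent of the small term: 1e-3000 == 2**(-3000 * log2(10)).
small_exp = -3000 * math.log2(10)

# The leading bit of 4 sits at 2**2; the significand must reach from
# there down to the leading bit of 1e-3000.
bits_needed = math.ceil(2 - small_exp) + 1

# Significand widths (including the leading bit) of common formats.
SIGNIFICAND_BITS = {
    "binary64 (double)": 53,
    "x87 extended (80-bit)": 64,
    "binary128 (quad)": 113,
}

print("bits needed:", bits_needed)
for name, bits in SIGNIFICAND_BITS.items():
    print(name, bits, "enough" if bits >= bits_needed else "not enough")

# With enough decimal digits of working precision, the sum is kept exactly.
wide = Context(prec=3010)
total = wide.add(Decimal(4), Decimal("1e-3000"))
print(total > 4)  # True
```

None of the fixed-width formats comes close to the required width, which is why the sum collapses to 4; keeping the small term requires an arbitrary-precision type of some kind.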

These are very basic concepts of floating point, and you should definitely read the links on the floating-point tag info page: https://stackoverflow.com/tags/floating-point/info

Upvotes: 1
