Reputation: 3670
To reduce the file size, I'm trying to save float64
data to file as float32
. The data values generally range from 1e-12 to 10, so I tested the accuracy loss when converting float64
to float32
.
print(np.finfo('float32'))
shows
Machine parameters for float32
---------------------------------------------------------------
precision= 6 resolution= 1.0000000e-06
machep= -23 eps= 1.1920929e-07
negep = -24 epsneg= 5.9604645e-08
minexp= -126 tiny= 1.1754944e-38
maxexp= 128 max= 3.4028235e+38
nexp = 8 min= -max
---------------------------------------------------------------
It looks like float32
has a resolution of 1e-6
, and absolute values are representable down to about 1.2e-38
.
import numpy as np

x = 2.0 * np.random.rand(100) - 1.0   # random numbers in [-1, 1]
print('x.dtype: %s' % x.dtype)        # float64
print('number : max_error max_relative_error')
for i in range(-40, 1):
    y = x * 10.0**i
    # max absolute error of the float64 -> float32 -> float64 round trip
    err = np.max(np.abs(y - y.astype('f4').astype('f8')))
    print('1e%-4d: %e %e' % (i, err, err / 10.0**i))
The results are
number: max_error max_relative_error
1e-40 : 6.915620e-46 6.915620e-06
1e-39 : 6.910361e-46 6.910361e-07
1e-38 : 6.949349e-46 6.949349e-08
1e-37 : 4.816590e-45 4.816590e-08
1e-36 : 4.303771e-44 4.303771e-08
1e-35 : 3.518621e-43 3.518621e-08
1e-34 : 5.165854e-42 5.165854e-08
1e-33 : 3.660088e-41 3.660088e-08
1e-32 : 3.660088e-40 3.660088e-08
1e-31 : 4.097193e-39 4.097193e-08
1e-30 : 4.615068e-38 4.615068e-08
1e-29 : 3.696983e-37 3.696983e-08
1e-28 : 2.999860e-36 2.999860e-08
1e-27 : 4.723454e-35 4.723454e-08
1e-26 : 3.801082e-34 3.801082e-08
1e-25 : 3.062408e-33 3.062408e-08
1e-24 : 4.876378e-32 4.876378e-08
1e-23 : 3.779378e-31 3.779378e-08
1e-22 : 3.144592e-30 3.144592e-08
1e-21 : 4.991049e-29 4.991049e-08
1e-20 : 3.949261e-28 3.949261e-08
1e-19 : 3.002761e-27 3.002761e-08
1e-18 : 5.162480e-26 5.162480e-08
1e-17 : 4.135703e-25 4.135703e-08
1e-16 : 3.282146e-24 3.282146e-08
1e-15 : 4.722129e-23 4.722129e-08
1e-14 : 3.863295e-22 3.863295e-08
1e-13 : 3.375549e-21 3.375549e-08
1e-12 : 4.011790e-20 4.011790e-08
1e-11 : 4.011790e-19 4.011790e-08
1e-10 : 3.392060e-18 3.392060e-08
1e-9 : 5.471206e-17 5.471206e-08
1e-8 : 4.072652e-16 4.072652e-08
1e-7 : 3.496987e-15 3.496987e-08
1e-6 : 5.662626e-14 5.662626e-08
1e-5 : 4.412957e-13 4.412957e-08
1e-4 : 3.482083e-12 3.482083e-08
1e-3 : 5.597344e-11 5.597344e-08
1e-2 : 4.620014e-10 4.620014e-08
1e-1 : 3.540690e-09 3.540690e-08
1e0 : 2.817751e-08 2.817751e-08
The relative error is on the order of 1e-8
for values above 1e-38, lower than the 1e-6
resolution reported by np.finfo
, and the error is still acceptable even when the value is smaller than the tiny
value from np.finfo
.
It looks safe to save my data as float32
, but why does the test seem inconsistent with the values reported by np.finfo
?
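One way to see the actual step between adjacent float32 values is `np.spacing` (a quick sketch, separate from the test above):

```python
import numpy as np

# Gap to the next float32 at 1.0 equals eps (~1.19e-07), so
# rounding to the nearest float32 loses at most half of that.
print(np.spacing(np.float32(1.0)))

# In the subnormal range the gap is fixed at the smallest
# subnormal (~1.4e-45), regardless of the value.
print(np.spacing(np.float32(1e-40)))
```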
Upvotes: 4
Views: 4613
Reputation: 416
Since the float32 machine epsilon is 1.1920929e-07, round-to-nearest keeps the relative error within half of that for normal floats: 5.9604645e-08. However, below 1.1754944e-38 you get denormalized (subnormal) numbers, for which only the absolute error is bounded, by half the subnormal spacing of 1.4012985e-45.
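Both bounds can be checked directly (a minimal sketch; `np.nextafter` from 0 gives the smallest subnormal):

```python
import numpy as np

eps = np.finfo(np.float32).eps    # 1.1920929e-07
tiny = np.finfo(np.float32).tiny  # 1.1754944e-38, smallest normal float32

# Normal range: round-to-nearest keeps the relative error within eps/2.
x = np.float64(0.123456789)
rel_err = abs(np.float64(np.float32(x)) - x) / x
assert rel_err <= eps / 2

# Subnormal range: the spacing is fixed, so only the absolute
# error is bounded, by half the smallest subnormal.
smallest_sub = np.nextafter(np.float32(0), np.float32(1))  # ~1.4e-45
y = np.float64(1e-40)
abs_err = abs(np.float64(np.float32(y)) - y)
assert abs_err <= smallest_sub / 2
```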
Upvotes: 1
Reputation: 86124
Numbers that low are in the subnormal range. Basically, the exponent doesn't have enough range to get sufficiently low, so you're gradually losing significant bits as values get lower. This is called "gradual underflow".
https://en.wikipedia.org/wiki/Denormal_number
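The gradual loss can be sketched by pushing a value below tiny and watching the relative spacing grow (assuming `np.spacing`, which returns the gap to the next representable value):

```python
import numpy as np

x = np.float32(1.2e-38)  # just above the normal/subnormal boundary
for _ in range(5):
    x = np.float32(x / 2)      # push deeper into the subnormal range
    # relative spacing grows: the exponent has hit its floor,
    # so each halving costs one bit of significand
    print(x, np.spacing(x) / x)
```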
Upvotes: 6