Lucas Barrot

Reputation: 11

Float32 to float16 conversion in Python and how it impacts np.isclose's atol

I am trying to compare the layer-by-layer outputs of two ML models trained with different libraries (TensorFlow and a custom, lighter one) as a sort of unit test. Both models have the same validation performance, yet I see big differences in this test between float32 training and float16 training, even though their predictive performance is very similar.

When training the model with float32 weights, I use np.isclose with atol=1e-08 to compare the layer outputs and I get only minimal mismatches. But when I train the model with float16 weights, I get far more mismatches, even though the two models have the same performance and I increased the atol to 1e-04.

I am wondering whether just increasing the atol to 1e-04 is the right approach, since the np.isclose comparison of the layer outputs reports far more mismatches in float16 than in float32, despite the two models having the same predictive performance.
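To give a sense of scale, here is a minimal standalone sketch (independent of the actual models): even a single float16 dot product over a few hundred terms accumulates rounding error far above 1e-04, so a fixed atol of that size can be exceeded even when both implementations are correct:

import numpy as np

# Sketch: compare one float16 dot product against a float64 reference
# to see how large the accumulated rounding error gets.
rng = np.random.default_rng(0)
x = rng.standard_normal(512)
w = rng.standard_normal(512)

ref = np.dot(x, w)  # float64 reference
f16 = np.dot(x.astype(np.float16), w.astype(np.float16))  # float16 computation

print(abs(float(f16) - ref))  # on the order of 1e-2 here, far above atol=1e-04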

EDIT: To add more context, I made a small script that reproduces the "error" I don't understand. Here is an example of a dense layer with the two different implementations (Keras and the custom lib):

import numpy as np

import keras
import keras.backend


def custom_dense_layer(_input_data, weights, biases) -> np.ndarray:
    # The weight matrix is stored as [n1_weights, n2_weights, ...], so we
    # compute X.W + B instead of (transpose(W).transpose(X) + transpose(B)).
    return np.matmul(_input_data, weights) + biases


if __name__ == '__main__':
    bit_precision = 16  # weights/biases/computation dtype; compare 16 vs 32
    keras.backend.set_floatx(f'float{bit_precision}')  # set computation dtype

    weights = np.random.randn(16, 16).astype(f'float{bit_precision}')  # random weights
    biases = np.random.randn(16).astype(f'float{bit_precision}')  # random biases
    input_tensor = keras.Input(shape=(16,), dtype=f'float{bit_precision}')  # input spec
    input_data = np.random.randn(1000, 16).astype(f'float{bit_precision}')  # random data

    keras_dense = keras.layers.Dense(units=16)  # Keras dense layer
    keras_dense(input_tensor)  # build the layer so its weights exist
    keras_dense.set_weights([weights, biases])  # load the same weights and biases

    custom_op = custom_dense_layer(_input_data=input_data, weights=weights, biases=biases)
    keras_op = np.array(keras_dense(input_data))

    # Compare row by row, with tolerances scaled to the precision.
    isclose = np.isclose(keras_op, custom_op,
                         atol=10 ** -(bit_precision / 4),
                         rtol=2 ** -(bit_precision / 2)).all(axis=1)
    failed_points = np.where(~isclose)[0]  # rows that are not close enough
    perc_failed = round(100 * failed_points.size / isclose.size, 3)  # "failed" percentage
    print(perc_failed)

Even though in float16 I scale the tolerances with the reduced precision (atol goes from 1e-08 to 1e-04, rtol from 2**-16 to 2**-8), I get a 60+% failure rate, versus only ~2% with float32.
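For reference, NumPy's reported machine epsilon for each dtype suggests the relative resolution drops by far more than my tolerance scaling accounts for:

import numpy as np

# Relative resolution (machine epsilon) of each dtype.
eps16 = np.finfo(np.float16).eps  # 2**-10, about 9.77e-04
eps32 = np.finfo(np.float32).eps  # 2**-23, about 1.19e-07
print(eps16, eps32, eps16 / eps32)  # the ratio is 8192x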

Upvotes: 1

Views: 150

Answers (2)

Jack Bosco

Reputation: 71

If I understand the question correctly, you are comparing float16 values against float32 values for approximate equality using np.isclose().

One solution I can think of is using relative tolerance in addition to absolute tolerance in the parameters of np.isclose(). The relative tolerance scales with the magnitude of the second argument: the check passes when the absolute difference is within atol plus rtol times the absolute value of the right-hand term. Source: https://numpy.org/doc/stable/reference/generated/numpy.isclose.html
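Concretely, per that documentation, np.isclose(a, b, rtol=rtol, atol=atol) checks abs(a - b) <= atol + rtol * abs(b) element-wise. A quick sketch reproducing that check by hand, using the same example values as below:

import numpy as np

a, b = np.float16(2), np.float32(2 + 2**(-15))
atol, rtol = 1e-8, 2**(-16)

# Reproduce np.isclose's documented condition by hand.
manual = abs(a - b) <= (atol + rtol * abs(b))
print(manual, np.isclose(a, b, rtol=rtol, atol=atol))  # True True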

For example, if I were to make a 32-bit float s32 larger than a 16-bit float s16 by a fraction so small it couldn't be captured in 16-bit precision:

import numpy as np

s16 = np.float16(2)
s32 = np.float32(2 + 2**(-15))

u16 = np.array(s16, dtype=np.float16).view(dtype=np.uint16)
u32 = np.array(s32, dtype=np.float32).view(dtype=np.uint32)

print('32:\t' + bin(u32))
print('16:\t' + bin(u16))

It outputs:

32:     0b1000000000000000000000010000000
16:     0b100000000000000

So the difference cannot be represented in 16 bits. Checking with only a tiny absolute tolerance (np.isclose's default rtol of 1e-05 still applies, but is too small here) evaluates to False:

>>> print('Equality:\t', np.isclose(s16, s32, atol=(1e-8)))
Equality:        False

But with some extra relative tolerance, set to 2**(-16) to account for the precision difference, it evaluates to True:

>>> print('Equality:\t', np.isclose(s16, s32, atol=(1e-8), rtol=(2**(-16))))
Equality:        True

So, to answer your question: adding relative tolerance to account for the difference in precision is a sound mathematical justification for setting the tolerance of the equality check. Hopefully this gives you more accurate results when testing the ML model, but without more information about the custom library you are using (e.g. whether it uses NumPy floats), this is the best advice I can give.

Upvotes: 1

jsbueno

Reputation: 110186

You should be using rtol rather than atol when comparing floating-point numbers. And 1e-04 is close to the resolution of the float16 mantissa (1/1024, see https://en.wikipedia.org/wiki/Half-precision_floating-point_format). Roughly, it means that an atol of 1e-4 allows for a one-bit difference when working close to unit scale (i.e. exponents near 0). So yes, it does not seem like demanding any better precision than that would be useful for float16.
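A minimal sketch of that (choosing rtol at the float16 mantissa resolution is my suggestion, not something NumPy prescribes):

import numpy as np

a = np.float16(1.0)
b = np.float16(1.0) + np.finfo(np.float16).eps  # one ulp away from 1.0

# An rtol at the float16 resolution (2**-10 = 1/1024) tolerates a one-bit
# difference at any scale; a fixed atol only does so near exponent 0.
eps = np.finfo(np.float16).eps
print(np.isclose(a, b, rtol=eps, atol=0))              # True
print(np.isclose(a * 512, b * 512, rtol=eps, atol=0))  # True: rtol scales
print(np.isclose(a * 512, b * 512, rtol=0, atol=1e-4)) # False: atol does not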

Upvotes: 0
