mrgloom

Reputation: 21642

What does 'quantization' mean in interpreter.get_input_details()?

Using tflite and getting properties of interpreter like :

print(interpreter.get_input_details())

[{'name': 'input_1_1', 'index': 47, 'shape': array([  1, 128, 128,   3], dtype=int32), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.003921568859368563, 0)}]

What does 'quantization': (0.003921568859368563, 0) mean?

Upvotes: 5

Views: 1464

Answers (2)

Ben Butterworth

Reputation: 28818

Unfortunately the documentation of get_input_details doesn't explain it:

Returns: A list of input details.

But if you look at the source code of get_input_details, you'll see it calls _get_tensor_details (source), and that function does document the return value:

    """Gets tensor details.
    Args:
      tensor_index: Tensor index of tensor to query.
    Returns:
      A dictionary containing the following fields of the tensor:
        'name': The tensor name.
        'index': The tensor index in the interpreter.
        'shape': The shape of the tensor.
        'quantization': Deprecated, use 'quantization_parameters'. This field
            only works for per-tensor quantization, whereas
            'quantization_parameters' works in all cases.
        'quantization_parameters': The parameters used to quantize the tensor:
          'scales': List of scales (one if per-tensor quantization)
          'zero_points': List of zero_points (one if per-tensor quantization)
          'quantized_dimension': Specifies the dimension of per-axis
              quantization, in the case of multiple scales/zero_points.

What does it mean?

These quantization parameters are the values used to quantize a tensor (i.e. to map numbers from one range into another, more limited range, e.g. 0-10 into 0-1). In TensorFlow, this specifically means changing to a data type that can represent fewer distinct values: e.g. float32 to float16, float32 to uint8, or float16 to int8. Dequantization is the reverse (e.g. recovering float probabilities from a model whose output was quantized to uint8, so the raw output is an integer between 0 and 255).

The maths is quite simple; it is a more general form of normalization (rescaling values to the range 0 to 1), where s is the scale and z is the zero point:

  • quantization: q = (f / s) + z
  • dequantization: f = (q - z) * s
  • For more on this quantization equation, see the Quantization Specification.
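The two formulas above can be sketched in plain Python using the scale and zero point from the question (scale ≈ 1/255, zero_point = 0). The function names here are just for illustration:

```python
# Quantization parameters from the question's input tensor.
scale, zero_point = 0.003921568859368563, 0  # scale is ~1/255

def quantize(f, scale, zero_point):
    """q = (f / s) + z, rounded and clipped to the uint8 range."""
    q = round(f / scale) + zero_point
    return max(0, min(255, q))

def dequantize(q, scale, zero_point):
    """f = (q - z) * s"""
    return (q - zero_point) * scale

q = quantize(1.0, scale, zero_point)   # float 1.0 maps to uint8 255
f = dequantize(q, scale, zero_point)   # maps back to ~1.0
```

With this particular scale and zero point, the uint8 range 0-255 corresponds almost exactly to the float range 0.0-1.0, which is typical for image inputs normalized to [0, 1].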

Note: Aleksandr Kondratyev's equation f = (q - zero_point) * scale is actually dequantization, since it takes q (quantized value) and provides you f (float). Of course you can reverse the equation to get the other one.

Upvotes: 1

Aleksandr Kondratyev

Reputation: 501

It means the quantization parameter values: the scale and zero_point of the input tensor.

These are needed to convert a quantized uint8 number q to a floating-point number f using the formula:

f = (q - zero_point) * scale
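Applied to a whole tensor with NumPy, and using the scale and zero point from the question, this is a sketch of how the formula dequantizes uint8 values (the array contents here are made up for illustration):

```python
import numpy as np

# Parameters from the question's 'quantization' tuple: (scale, zero_point).
scale, zero_point = 0.003921568859368563, 0

q = np.array([0, 128, 255], dtype=np.uint8)       # quantized uint8 values
f = (q.astype(np.float32) - zero_point) * scale   # f = (q - zero_point) * scale
# f is approximately [0.0, 0.502, 1.0]
```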

Upvotes: 5
