TensorFlow quantization: Array output does not have MinMax information

Question

I am attempting to create a quantized TensorFlow model for inference with the Coral USB Accelerator. Here is a minimal standalone example of my issue:

import sys

import tensorflow as tf

CKPT = "a/out.ckpt"
TFLITE = "a/out.tflite"

args = sys.argv[1:]
if 0 == len(args):
    print("Options are 'train' or 'save'")
    exit(-1)
cmd = args[0]

if cmd not in ["train", "save"]:
    print("Options are 'train' or 'save'")
    exit(-1)

tr_in = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
tr_out = [[1.0], [1.0], [0.0], [0.0]]

nn_in = tf.placeholder(tf.float32, (None, 2), name="input")

W = tf.Variable(tf.random_normal([2, 1], stddev=0.1))
B = tf.Variable(tf.ones([1]))

nn_out = tf.nn.relu6(tf.matmul(nn_in, W) + B, name="output")

if "train" == cmd:
    tf.contrib.quantize.create_training_graph(quant_delay=0)
    nn_act = tf.placeholder(tf.float32, (None, 1), name="actual")
    diff = tf.reduce_mean(tf.pow(nn_act - nn_out, 2))
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        optimizer = tf.train.AdamOptimizer(
            learning_rate=0.0001,
        )
        goal = optimizer.minimize(diff)
else:
    tf.contrib.quantize.create_eval_graph()

init = tf.global_variables_initializer()
with tf.Session() as session:
    session.run(init)

    saver = tf.train.Saver()
    try:
        saver.restore(session, CKPT)
    except BaseException as e:
        print("While trying to restore: {}".format(str(e)))

    if "train" == cmd:
        for epoch in range(2):
            _, d = session.run([goal, diff], feed_dict={
                nn_in: tr_in,
                nn_act: tr_out,
            })
            print("Loss: {}".format(d))
        saver.save(session, CKPT)
    elif "save" == cmd:
        converter = tf.lite.TFLiteConverter.from_session(
            session, [nn_in], [nn_out],
        )
        converter.inference_type = tf.lite.constants.QUANTIZED_UINT8
        input_arrays = converter.get_input_arrays()
        converter.quantized_input_stats = {input_arrays[0] : (0.0, 1.0)}
        tflite_model = converter.convert()
        with open(TFLITE, "wb") as f:
            f.write(tflite_model)

Assuming you have a directory called "a", this can be run with:

python example.py train
python example.py save

The "train" step should work fine, but when attempting to export the quantized tflite file, I get the following:

...
2019-05-14 14:03:44.032912: F tensorflow/lite/toco/graph_transformations/quantize.cc:144] Array output does not have MinMax information, and is not a constant array. Cannot proceed with quantization.
Aborted

My goal is to successfully run the "save" step and end up with a trained quantized model. What am I missing?

Pavel Konovalov · Accepted Answer

There is a tricky bug in TFLiteConverter:

To convert to the quantized model format, it requires additional nodes (with MinMax info) after each (well, almost each) mathematical operation node.
These additional nodes are added by the create_eval_graph function after the corresponding operations.
But during conversion to the TFLite format, the converter only takes into account the nodes between the inputs and outputs (inclusive). Therefore the additional node (with MinMax info) after your nn_out is "thrown away" in this case, which leads to the conversion error you mention :(

That bug doesn't appear if you build a classification network, which usually ends with a softmax layer (which doesn't require MinMax info). But for regression networks this is a problem. I use the following workaround.

Add an extra (actually meaningless) operation after your output layer, before calling the create_eval_graph function. In the question's script, that would be directly after the line that defines nn_out:

nn_out = tf.minimum(nn_out, 1e6)

You can use any arbitrary number for the second argument, as long as it is much bigger than the upper bound of the values you expect from the output layer (relu6 caps the output at 6 here, so 1e6 is far above it). It works perfectly in my case.
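
To make the explanation above concrete, here is a toy sketch in plain Python. It is not TensorFlow's actual implementation: the op names and the helpers add_minmax_nodes and keep_between are hypothetical, and the graph is reduced to a simple chain. It only mimics the two behaviours described in this answer (MinMax nodes are appended after operations, and the converter keeps only what lies between inputs and outputs) to show why the extra operation keeps the MinMax node of the original output inside the converted range.

```python
# Toy model of the explanation above; NOT TensorFlow's real code.
# A "graph" here is just an ordered chain of op names.


def add_minmax_nodes(chain):
    """Mimic the graph rewriter: append a MinMax node after every op
    except the input (simplified; the real rewriter is more selective)."""
    rewritten = []
    for op in chain:
        rewritten.append(op)
        if op != "input":
            rewritten.append(op + "/minmax")
    return rewritten


def keep_between(chain, first, last):
    """Mimic the converter: keep only nodes from `first` to `last`, inclusive."""
    return chain[chain.index(first): chain.index(last) + 1]


# Original network: relu6 is the declared output.
original = add_minmax_nodes(["input", "matmul_add", "relu6"])
kept_original = keep_between(original, "input", "relu6")

# Workaround: an extra op follows relu6 and becomes the declared output.
patched = add_minmax_nodes(["input", "matmul_add", "relu6", "minimum"])
kept_patched = keep_between(patched, "input", "minimum")

print(kept_original)
# ['input', 'matmul_add', 'matmul_add/minmax', 'relu6']
print(kept_patched)
# ['input', 'matmul_add', 'matmul_add/minmax', 'relu6', 'relu6/minmax', 'minimum']
```

In the first case "relu6/minmax" falls outside the kept range, which corresponds to the "does not have MinMax information" error. In the second case it is kept. The sketch does not model whether the trailing minimum op needs its own MinMax node; that it converts fine is taken from my experience above.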
