Reputation: 21
I would like to make a custom quantizer (not standard 8-bit) in TensorFlow.
I've gone through the code in tensorflow\tensorflow\contrib\quantize\python
and can see how the nodes are added, but I would like to modify how the tf.fake_quant_with_min_max_vars
function calculates its outputs.
I cannot seem to find the code that actually does the 32-bit accumulation and the downsampling to 8 bits. Can anyone point me to where this code resides?
Upvotes: 1
Views: 161
Reputation: 2878
The code that does the actual quantization of the values is in C++, in this function here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/fake_quant_ops_functor.h#L79
It's not particularly easy to modify, since you'll need to rebuild TensorFlow to get the changes, but hopefully that gives you a start.
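If you mainly want to experiment with a different bit width before touching the C++, here is a rough NumPy sketch of what I understand that kernel to compute. It is an approximation written from memory, not a copy of the TensorFlow source: the helper name `fake_quantize` is my own, and it assumes the kernel nudges min/max so that zero lands exactly on a quantization level, then clamps, rounds to the nearest level, and maps back to float. Note that everything stays in floating point here; there is no integer accumulate step in this sketch.

```python
import numpy as np

def fake_quantize(x, min_val, max_val, num_bits=8, narrow_range=False):
    """Hypothetical helper: simulate quantization in float (quantize, then dequantize)."""
    # Integer range of the target type, kept as floats for the arithmetic below.
    quant_min = 1.0 if narrow_range else 0.0
    quant_max = float(2 ** num_bits - 1)

    # Step size between adjacent quantization levels.
    scale = (max_val - min_val) / (quant_max - quant_min)

    # Nudge the range so that 0.0 falls exactly on a level (assumed behaviour).
    zero_point_from_min = quant_min - min_val / scale
    nudged_zero_point = float(
        np.clip(np.floor(zero_point_from_min + 0.5), quant_min, quant_max)
    )
    nudged_min = (quant_min - nudged_zero_point) * scale
    nudged_max = (quant_max - nudged_zero_point) * scale

    # Clamp to the nudged range, round to the nearest level, map back to float.
    clamped = np.clip(np.asarray(x, dtype=np.float64), nudged_min, nudged_max)
    levels = np.floor((clamped - nudged_min) / scale + 0.5)
    return levels * scale + nudged_min

# Example: 4-bit quantization over the range [0, 15].
print(fake_quantize([7.3, 20.0], 0.0, 15.0, num_bits=4))
```

Changing `num_bits` or the rounding line is the kind of edit you would then have to port into the C++ functor and rebuild.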
Upvotes: 1