RaresV

Reputation: 31

Using KissFFT to create features for tflite-micro audio classification

I am trying to run audio classification using tflite-micro on ESP32, with fixed point calculations.

The model is created using Keras, then converted to TFLite and quantized to uint8. Cross-validating the Keras model against the TFLite model in Python yields good results.

My code structure is: audio capture (int16) --> create spectrogram using STFT (int16 -> uint16) --> quantize (uint16 -> uint8) --> run inference.

The STFT is computed on the ESP32 with KissFFT, using 16-bit int input.

My problem is that I can't figure out how to do the scaling and quantization in fixed point so that the values match what the model receives as input in Python for the same audio file.
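For the uint16 -> uint8 step, the target is set by the quantization parameters baked into the converted model. A minimal sketch of applying them in Python (the scale and zero point values below are hypothetical placeholders; the commented lines show where the real ones come from):

```python
import numpy as np

# The (scale, zero_point) pair comes from the converted model; in Python:
#   import tensorflow as tf
#   interpreter = tf.lite.Interpreter(model_path="model.tflite")  # hypothetical filename
#   scale, zero_point = interpreter.get_input_details()[0]["quantization"]
scale, zero_point = 0.05, 128  # hypothetical values for illustration

def quantize_uint8(features):
    """Map real-valued spectrogram features to uint8 using the model's params."""
    q = np.round(features / scale + zero_point)
    return np.clip(q, 0, 255).astype(np.uint8)

features = np.array([0.0, 1.0, 6.35])
print(quantize_uint8(features))  # -> [128 148 255]
```

The same formula, q = value / scale + zero_point, has to be reproduced in the fixed-point path on the ESP32, which is why any extra scaling in the FFT output shows up directly in the quantized input.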

The FFT for a window looks appropriate (see the attached picture).

[Figure: FFT comparison]

I tried different formulas for scaling the KissFFT output into the same range as the Octave/Python values, but none of them made the inference output equivalent.

Any thoughts on how I should scale the data?

Upvotes: 1

Views: 229

Answers (1)

Jon Nordby

Reputation: 6289

First, make sure you use the same FFT length. The shapes of the values look quite similar, so it is plausible that the difference is only a scaling factor, as you suspect. Divide one spectrum by the other, bin by bin; that gives you candidates for the scaling factor. Check their distribution and pick the most common value (the median is a robust choice).
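That procedure can be sketched in a few lines of NumPy. The arrays here are synthetic stand-ins for the two spectra; in practice `ref` would be the Python/Octave spectrum and `dut` the KissFFT output for the same window:

```python
import numpy as np

# Hypothetical data standing in for the two spectra of the same window
rng = np.random.default_rng(0)
ref = rng.uniform(1.0, 100.0, size=256)   # reference spectrum (Python/Octave)
dut = ref / 512.0                          # pretend fixed-point output, scaled down by the FFT length

ratios = ref / dut            # one candidate scaling factor per bin
factor = np.median(ratios)    # robust against a few outlier bins
print(factor)                 # -> 512.0
```

If the ratios cluster tightly around one value, a single multiplicative correction is enough; a spread that varies with frequency would point to something other than pure scaling.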

For fixed-point, implementations sometimes apply internal scaling to keep values inside the chosen number representation, so there are several possible factors.

For FFT it is also rather common to divide the output by the FFT length. It could be that one of your implementations does this and the other does not.

Upvotes: 0
