Reputation: 166
I am using eager mode quantization, but I want to skip some layers from being quantized. I am following the tutorial here.
However, when I test the model I now get the following error:
Could not run 'aten::_slow_conv2d_forward' with arguments from the 'QuantizedCPU' backend.
If I understand correctly, this is because the layers with qconfig = None are receiving quantized tensors while expecting dequantized (fp32) tensors. Is there a way I can add an instruction to dequantize the data before such a layer and quantize it again after the layer, in my loop? Or what other workaround could I use for this purpose?
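To make the idea concrete, here is roughly what I imagine: a hypothetical wrapper (SkipQuantWrapper is a name I made up, not an existing API) that I could swap in for each skipped module:

import torch.nn as nn
from torch.quantization import QuantStub, DeQuantStub

class SkipQuantWrapper(nn.Module):
    """Hypothetical wrapper: run one layer in fp32 inside an otherwise quantized model."""
    def __init__(self, fp32_module, qconfig):
        super().__init__()
        self.dequant = DeQuantStub()  # int8 -> fp32 before the skipped layer
        self.module = fp32_module
        self.module.qconfig = None    # keep the wrapped layer unquantized
        self.quant = QuantStub()
        self.quant.qconfig = qconfig  # its observer learns scale/zero_point during calibration

    def forward(self, x):
        x = self.dequant(x)
        x = self.module(x)
        return self.quant(x)          # fp32 -> int8 for the next quantized layer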
The code to exclude layers:
for quantized_layer, module in fused_model.named_modules():
    if quantized_layer in sortedSensitivityDict:
        if sortedSensitivityDict[quantized_layer] > 0.94:
            module.qconfig = torch.quantization.get_default_qconfig("qnnpack")
        else:
            module.qconfig = None
The code to quantize:
import torch

model_fp32_prepared = torch.quantization.prepare(fused_model)

def calibrate(model, data_loader):
    model.eval()
    with torch.no_grad():
        for image, target in data_loader:
            model(image)

calibrate(model_fp32_prepared, val_loader)
model_fp32_prepared.eval()
model_int8 = torch.quantization.convert(model_fp32_prepared)
The main problem is that I am using MobileNetV3, where the forward function is as follows:
def _forward_impl(self, x: Tensor) -> Tensor:
    x = self.features(x)
    x = self.avgpool(x)
    x = torch.flatten(x, 1)
    x = self.classifier(x)
    return x
Since the layers are in self.features, I am not sure how to use self.quant and self.dequant.
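For reference, the quantizable MobileNetV3 in torchvision (torchvision.models.quantization.mobilenetv3) only brackets the whole network with the stubs, roughly like this, so they do not help for individual blocks inside self.features:

from torch.quantization import QuantStub, DeQuantStub

class QuantizableMobileNetV3(MobileNetV3):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.quant = QuantStub()      # quantizes the fp32 input
        self.dequant = DeQuantStub()  # dequantizes the int8 output

    def forward(self, x):
        x = self.quant(x)
        x = self._forward_impl(x)
        x = self.dequant(x)
        return x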
Upvotes: 0
Views: 875
Reputation: 51
Blog author here - that can be fairly tricky with eager mode, unfortunately. We have a new API using FX Graph Mode that makes operations like this easier. You won't need to set each module's qconfig; instead, you can pass a dict with the names of the layers you want to disable.
Something like:
disable_layers = []
for quantized_layer, _ in fused_model.named_modules():
    if quantized_layer in sortedSensitivityDict:
        if sortedSensitivityDict[quantized_layer] <= 0.94:  # the layers you set to qconfig = None above
            disable_layers.append(quantized_layer)
qconfig_dict = {
    # Global config
    "": torch.quantization.get_default_qconfig("qnnpack"),
    # Disable by layer name
    "module_name": [(m, None) for m in disable_layers],
    # Or disable by layer/op type
    "object_type": [
        (torch.add, None),  # skips quantization for all add ops
        ...,
    ],
}
model_fp32_prepared = torch.quantization.quantize_fx.prepare_fx(model, qconfig_dict)
# calibrate as usual
model_int8 = torch.quantization.quantize_fx.convert_fx(model_fp32_prepared)
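As an optional sanity check (not required), convert_fx returns a torch.fx.GraphModule, so you can print its graph to confirm that the disabled layers stayed in fp32 and that dequantize/quantize nodes were inserted automatically at their boundaries:

# The traced graph shows where quantize_per_tensor / dequantize nodes were placed.
print(model_int8.graph)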
For reference, I have a notebook walking through this workflow here: https://github.com/fbsamples/pytorch-quantization-workshop/blob/main/Quant_Workflow.ipynb
Upvotes: 0