StackOverflow Questions for Tag: quantization

arkuzo

Reputation: 41

How to quantize a HF safetensors model and save it to llama.cpp GGUF format with less than q8_0 quantization?

large-language-model, huggingface, quantization, llamacpp

Score: 2

Answers: 1

Zylon

Reputation: 29

ONNX-Python: Can someone explain the Calibration_Data_Reader requested by the static_quantization-function?

python, onnx, quantization, onnxruntime, static-quantization

Score: 3

Answers: 1

Owen Zhang

Reputation: 23

Onnxruntime quantization script for MatMulNbits, what is the type after conversion?

onnx, quantization, onnxruntime

Score: 0

Answers: 0

Hiba Lashari

Reputation: 1

Structured Pruning of Yolov8

optimization, parameters, quantization, yolov8, pruning

Score: 0

Answers: 0

doniker99

Reputation: 31

RuntimeError: "Unused kwargs" and "frozenset object has no attribute discard" with BitsAndBytes bf16 Quantized Model in Hugging Face Gradio App

huggingface-transformers, quantization, gradio, bfloat16, gemma

Score: 3

Answers: 1

Muchacho

Reputation: 17

Issues with MP3-like Compression: Quantization and File Size

python, audio, compression, signal-processing, quantization

Score: 1

Answers: 0

gillo04

Reputation: 98

Trying to quantize YOLOv11 in tensorflow, is this topology normal?

tensorflow, esp32, yolo, quantization

Score: 0

Answers: 1

Pavan Pandya

Reputation: 1

Kernel Dies When Testing a Quantized ResNet101 Model in PyTorch

deep-learning, pytorch, quantization, quantization-aware-training

Score: 0

Answers: 0

Ben Sullivan

Reputation: 1

Unit testing PNG quantization by Sharp in Jest

unit-testing, jestjs, quantization, sharp

Score: 0

Answers: 0

Ravi Sankar Guntur

Reputation: 1

Reference code to convert microsoft/Multilingual-MiniLM-L12-H384 to int8 quantization

tensorflow, quantization, tflite

Score: 0

Answers: 0

Florida Man

Reputation: 2157

When using generator for representative dataset in quantization it "Failed to convert value into readable tensor"

python, tensorflow, generator, tf.keras, quantization

Score: 0

Answers: 2

Lijin Durairaj

Reputation: 5240

While trying to implement QLORA using trainer class, getting casting error

huggingface-transformers, quantization, qlora

Score: 0

Answers: 0

Ricked

Reputation: 11

Transforming a picture into a posterized image with matching grid overlay and symbols

python, numpy, python-imaging-library, rgba, quantization

Score: 1

Answers: 0

Franck Dernoncourt

Reputation: 83377

Why are model_q4.onnx and model_q4f16.onnx not 4 times smaller than model.onnx?

deep-learning, large-language-model, huggingface, onnx, quantization

Score: -1

Answers: 1

lauther27

Reputation: 23

How to serve a bitsandbytes model with SGLang

huggingface-transformers, large-language-model, huggingface, quantization, llama

Score: 0

Answers: 0

hafezmg48

Reputation: 89

pytorch quantized linear function gives shape invalid error

deep-learning, pytorch, matrix-multiplication, transformer-model, quantization

Score: 1

Answers: 0

meysam

Reputation: 83

How to Load a 4-bit Quantized VLM Model from Hugging Face with Transformers?

python, nlp, huggingface-transformers, huggingface, quantization

Score: 3

Answers: 1

Luis Leal

Reputation: 3524

valueError: Supplied state dict for layers does not contain `bitsandbytes__*` and possibly other `quantized_stats`(when load saved quantized model)

python, artificial-intelligence, huggingface-transformers, large-language-model, quantization

Score: 0

Answers: 1

user8370684

Reputation:

What is the "n" parameter in the JPEG spec's DQT segment?

jpeg, quantization

Score: 1

Answers: 2

Daniel Salvado

Reputation: 1

Why is my 8-bit quantized model slower than my 16-bit model?

python, tensorflow, neural-network, tensorflow-lite, quantization

Score: 0

Answers: 0

