StackOverflow Questions for Tag: quantization

arkuzo

Reputation: 41

How to quantize a HF safetensors model and save it to llama.cpp GGUF format with less than q8_0 quantization?

Score: 2

Views: 2021

Answers: 1

Zylon

Reputation: 29

ONNX-Python: Can someone explain the Calibration_Data_Reader requested by the static_quantization-function?

Score: 3

Views: 744

Answers: 1

Owen Zhang

Reputation: 23

Onnxruntime quantization script for MatMulNbits, what is the type after conversion?

Score: 0

Views: 11

Answers: 0

Hiba Lashari

Reputation: 1

Structured Pruning of YOLOv8

Score: 0

Views: 118

Answers: 0

doniker99

Reputation: 31

RuntimeError: "Unused kwargs" and "frozenset object has no attribute discard" with BitsAndBytes bf16 Quantized Model in Hugging Face Gradio App

Score: 3

Views: 407

Answers: 1

Muchacho

Reputation: 17

Issues with MP3-like Compression: Quantization and File Size

Score: 1

Views: 35

Answers: 0

gillo04

Reputation: 98

Trying to quantize YOLOv11 in TensorFlow: is this topology normal?

Score: 0

Views: 190

Answers: 1

Pavan Pandya

Reputation: 1

Kernel Dies When Testing a Quantized ResNet101 Model in PyTorch

Score: 0

Views: 15

Answers: 0

Ben Sullivan

Reputation: 1

Unit testing PNG quantization by Sharp in Jest

Score: 0

Views: 18

Answers: 0

Ravi Sankar Guntur

Reputation: 1

Reference code to convert microsoft/Multilingual-MiniLM-L12-H384 to int8 quantization

Score: 0

Views: 27

Answers: 0

Florida Man

Reputation: 2157

Using a generator for the representative dataset in quantization fails with "Failed to convert value into readable tensor"

Score: 0

Views: 2181

Answers: 2

Lijin Durairaj

Reputation: 5240

Casting error when implementing QLoRA with the Trainer class

Score: 0

Views: 19

Answers: 0

Ricked

Reputation: 11

Transforming a picture into a posterized image with matching grid overlay and symbols

Score: 1

Views: 58

Answers: 0

Franck Dernoncourt

Reputation: 83377

Why are model_q4.onnx and model_q4f16.onnx not 4 times smaller than model.onnx?

Score: -1

Views: 224

Answers: 1

lauther27

Reputation: 23

How to serve a bitsandbytes model with SGLang

Score: 0

Views: 192

Answers: 0

hafezmg48

Reputation: 89

PyTorch quantized linear function gives an invalid shape error

Score: 1

Views: 28

Answers: 0

meysam

Reputation: 83

How to Load a 4-bit Quantized VLM Model from Hugging Face with Transformers?

Score: 3

Views: 840

Answers: 1

Luis Leal

Reputation: 3524

ValueError: Supplied state dict for layers does not contain `bitsandbytes__*` and possibly other `quantized_stats` (when loading a saved quantized model)

Score: 0

Views: 574

Answers: 1

user8370684

Reputation:

What is the "n" parameter in the JPEG spec's DQT segment?

Score: 1

Views: 374

Answers: 2

Daniel Salvado

Reputation: 1

Why is my 8-bit quantized model slower than my 16-bit model?

Score: 0

Views: 94

Answers: 0
