Reputation: 299
If I load mistralai/Mistral-7B-v0.1 and try to count its parameters by looping over model.parameters(), I get ~3.7B parameters, but I was obviously expecting ~7B.
Also, model.get_memory_footprint() = 4.55GB: does that look reasonable for the 7B params in 4 bits?

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
base_model_id = "mistralai/Mistral-7B-v0.1"
# Create quantization config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
# Load model with quantization
model = AutoModelForCausalLM.from_pretrained(
    base_model_id, quantization_config=bnb_config
)
# Count params
def print_parameters(model):
    all_param = 0
    for param in model.parameters():
        all_param += param.numel()
    print(f"all params: {all_param}")
print_parameters(model)
>>> 3752071168
print(model.num_parameters())
>>> 7241732096
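For reference, the same loop reproduces the num_parameters() figure if the packed 4-bit tensors are doubled. A minimal sketch, assuming bitsandbytes' Params4bit class (which stores two 4-bit values per uint8 element):

import bitsandbytes as bnb

def count_logical_parameters(model):
    total = 0
    for param in model.parameters():
        # Params4bit packs two 4-bit weights into each uint8 element,
        # so its numel() reports half of the logical parameter count.
        if isinstance(param, bnb.nn.Params4bit):
            total += param.numel() * 2
        else:
            total += param.numel()
    return total

print(count_logical_parameters(model))  # should match model.num_parameters()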
Libraries:
transformers==4.36.1
torch==2.0.1
bitsandbytes==0.41.3.post2
Upvotes: 3
Views: 562
Reputation: 1
It looks like the quantized Linear layers report only half of the expected parameters because their 4-bit weights are packed two per uint8 element, e.g. (q_proj): Linear4bit(in_features=4096, out_features=4096, bias=False) contributes only 4096*4096/2 = 8388608 elements when looping over model.parameters(). The non-quantized tensors (embeddings, lm_head, and the norm weights, 262,410,240 parameters in total) are still counted at full size, which gives exactly the number you observed: (7,241,732,096 - 262,410,240) / 2 + 262,410,240 = 3,752,071,168. model.num_parameters() corrects for the packing, which is why it reports the full ~7B.
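A quick way to see the packing directly (a sketch against the model loaded in the question; the attribute path follows the Mistral architecture in transformers):

w = model.model.layers[0].self_attn.q_proj.weight
print(w.dtype)    # torch.uint8: two 4-bit values packed per byte
print(w.numel())  # 8388608 == 4096 * 4096 / 2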
Upvotes: 0