Reputation: 117
I am working with different quantized implementations of the same model, the main difference being the precision of the weights, biases, and activations. So I'd like to know how I can find the difference in size (in MB) between a model in, say, 32-bit floating point and the same model in int8. I have the models saved in .pth format.
Upvotes: 5
Views: 7190
Reputation: 6658
I have written a small function that calculates the size of your model from its number of parameters and its dtype. It currently supports fp32, fp16, bfloat16, and int8.
def cal_size(num_params, dtype):
    """Return the approximate model size in MB for a given parameter count and dtype."""
    if dtype == "float32":
        return (num_params / 1024**2) * 4   # 4 bytes per parameter
    elif dtype == "float16" or dtype == "bfloat16":
        return (num_params / 1024**2) * 2   # 2 bytes per parameter
    elif dtype == "int8":
        return (num_params / 1024**2) * 1   # 1 byte per parameter
    else:
        return -1                           # unsupported dtype

if __name__ == "__main__":
    import torchvision.models as models

    model = models.mobilenet_v2()
    # MobileNetV2 with width multiplier 1 has ~3.4M params
    total_params = sum(p.numel() for p in model.parameters())
    model_size = cal_size(total_params, "float32")
    if model_size != -1:
        print("Size of model is: {:.2f} MB".format(model_size))
    else:
        print("Incorrect dtype")
Upvotes: 1
Reputation: 1
An alternative is simply to measure the size of the folder the weights were downloaded to on your machine (if using something like Hugging Face).
import os
import subprocess

def get_folder_size(model_name):
    # Default Hugging Face cache location; adjust for your machine
    start_path = '/home/ubuntu/.cache/huggingface/hub'
    folder_name = 'models--' + model_name.replace('/', '--')
    folder_path = os.path.join(start_path, folder_name)
    # `du -sb` reports the total size in bytes as the first field
    size = subprocess.check_output(['du', '-sb', folder_path]).split()[0].decode('utf-8')
    size_in_bytes = int(size)
    size_in_gb = round(size_in_bytes / (1024 ** 3), 3)  # Convert bytes to GB, rounded to 3 decimals
    return size_in_gb
get_folder_size('thenlper/gte-base')
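If you prefer not to shell out to du (or you are on Windows), a rough pure-Python variant of the same idea using os.walk; the cache path and layout are assumptions based on the default Hugging Face cache and may differ on your machine:

import os

def get_folder_size_py(model_name, start_path='~/.cache/huggingface/hub'):
    # Assumed default Hugging Face cache layout: models--{org}--{name}
    folder_path = os.path.join(os.path.expanduser(start_path),
                               'models--' + model_name.replace('/', '--'))
    total_bytes = 0
    for root, _, files in os.walk(folder_path):
        for f in files:
            fp = os.path.join(root, f)
            if not os.path.islink(fp):          # skip symlinks to avoid double counting blobs
                total_bytes += os.path.getsize(fp)
    return round(total_bytes / (1024 ** 3), 3)  # GB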
Upvotes: 0
Reputation: 851
Following @Prajot's code, one can derive a precision multiplier and a set of parameter multipliers that lead to a rule of thumb.
Stats for 8-bit models:
---
7 billion params = 7 GB
13 billion params = 13 GB
175 billion params = 175 GB
Extending the rule of thumb:
---
16 bits? 2x
24 bits? 3x
32 bits? 4x
import numpy as np
import matplotlib.pyplot as plt

def cal_size(precision, num_params=10**9):
    # precision is in bits; num_params defaults to 1 billion parameters
    bytes_per_param = precision / 8
    size_in_gb = (num_params / 1024**3) * bytes_per_param
    return round(size_in_gb, 2)

# Assuming 1 billion params (you can change it accordingly)
precisions = list(range(1, 25))
sizes = [cal_size(precision) for precision in precisions]

plt.plot(precisions, sizes, marker='o')
plt.title('Model Size vs. Precision')
plt.xlabel('Precision (bits)')
plt.ylabel('Size (GB)')
plt.grid(True)
plt.xticks(range(1, 25))
plt.show()

# Calculate the slope using linear regression
precisions_arr = np.array(precisions)
sizes_arr = np.array(sizes)
slope, intercept = np.polyfit(precisions_arr, sizes_arr, 1)
print("Slope of the line:", slope, "intercept:", intercept)
Result: precision multiplier = slope of the line = 0.1164
if precision = 24 bits, param multiplier = 24 * 0.1164 = 2.79 ≈ 3
if precision = 16 bits, param multiplier = 16 * 0.1164 = 1.86 ≈ 2
if precision = 8 bits, param multiplier = 8 * 0.1164 = 0.93 ≈ 1
model size = params in billions x param multiplier
7B-24bits = 7 x 3 = 21 GB
13B-24bits = 13 x 3 = 39 GB
7B-16bits = 7 x 2 = 14 GB
7B-8bits = 7 x 1 = 7 GB
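The whole rule of thumb fits in one small function; a minimal sketch (the function name is hypothetical, and the 0.1164 multiplier is the regression slope above):

def rough_model_size_gb(params_in_billions, precision_bits, multiplier=0.1164):
    # Rule of thumb: size ≈ params (in billions) * precision (in bits) * 0.1164 GB
    return params_in_billions * precision_bits * multiplier

print(rough_model_size_gb(7, 8))    # ~6.5 GB (≈ 7 GB by the rule of thumb)
print(rough_model_size_gb(13, 16))  # ~24.2 GB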
Upvotes: 1
Reputation: 351
"To calculate the model size in bytes, one multiplies the number of parameters by the size of the chosen precision in bytes. For example, if we use the bfloat16 version of the BLOOM-176B model, we have 176*10**9 x 2 bytes = 352GB!"
This blog post on HF is worth a read: https://huggingface.co/blog/hf-bitsandbytes-integration
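That calculation is easy to reproduce for any model and dtype; a minimal sketch using the BLOOM-176B figure from the quote:

# Bytes per parameter for common dtypes
bytes_per_param = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}

num_params = 176 * 10**9                      # BLOOM-176B
size_bytes = num_params * bytes_per_param["bfloat16"]
print(size_bytes / 10**9, "GB")               # 352.0 GB (decimal GB, as in the quote)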
Upvotes: 1
Reputation: 43
You can count the model's parameters and buffers, then multiply each count by its element size to get the size everything occupies in memory.
import torchvision.models as models

model = models.resnet18()

param_size = 0
for param in model.parameters():
    param_size += param.nelement() * param.element_size()

buffer_size = 0
for buffer in model.buffers():
    buffer_size += buffer.nelement() * buffer.element_size()

size_all_mb = (param_size + buffer_size) / 1024**2
print('Size: {:.3f} MB'.format(size_all_mb))
And it will print:
Size: 361.209 MB
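Since the question's models are already saved as .pth files, another quick sanity check is to compare the file sizes on disk directly; a minimal sketch with hypothetical file names:

import os

# Hypothetical paths to the fp32 and int8 checkpoints
fp32_mb = os.path.getsize('model_fp32.pth') / 1024**2
int8_mb = os.path.getsize('model_int8.pth') / 1024**2
print('fp32: {:.2f} MB, int8: {:.2f} MB, difference: {:.2f} MB'.format(
    fp32_mb, int8_mb, fp32_mb - int8_mb))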
Upvotes: 2