DuckQueen

Reputation: 810

How to fuse 4-bit LLAMA weights with LoRA ones into one .pt file?

I followed this manual and produced llama-7b-hf-int4 (giving llama-7b-4bit.pt) and samwit/alpaca7B-lora (giving adapter_model.bin). Now I want to merge them into a single 4-bit .pt model. How can I do that?

Why I need this:

  1. The current llama.cpp supports only legacy single-file 4-bit models.
  2. 4-bit fine-tuning tools produce only small Alpaca fine-tuned adapter models, not full merged models.
  3. Only 4-bit Alpaca tuning is feasible on my current setup, so I need to know how to apply/merge the adapter into the base model (a sketch of the fp16 merge path I'm aware of follows below).
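For context, the only merge path I know of goes through fp16: load the fp16 base with transformers, apply the adapter with peft, and fold it in with merge_and_unload(). This is a minimal sketch, assuming the fp16 base the adapter was trained against is available (decapoda-research/llama-7b-hf stands in as a placeholder here), and it still leaves me with an fp16 model rather than a 4-bit .pt:

    import torch
    from transformers import LlamaForCausalLM
    from peft import PeftModel

    # Placeholder: the fp16 base model the LoRA adapter was trained against
    base = LlamaForCausalLM.from_pretrained(
        "decapoda-research/llama-7b-hf",
        torch_dtype=torch.float16,
    )

    # Attach the LoRA adapter weights (adapter_model.bin) to the base model
    model = PeftModel.from_pretrained(base, "samwit/alpaca7B-lora")

    # Fold the LoRA deltas into the base weights: W <- W + (alpha / r) * B @ A
    merged = model.merge_and_unload()
    merged.save_pretrained("llama-7b-alpaca-merged")

    # The merged fp16 model would then need re-quantizing to 4-bit
    # (e.g. with GPTQ-for-LLaMa) or converting with llama.cpp's tools
    # to end up with a single 4-bit file.

As far as I can tell, peft cannot merge a LoRA directly into GPTQ-packed 4-bit weights, so llama-7b-4bit.pt itself can't be patched in place; this detour through fp16 plus re-quantization is the part I'd like to avoid or confirm.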

Upvotes: 2

Views: 900

Answers (0)
