Finetuning Open LLMs

Question

I am a newbie trying to learn fine tuning. Started with falcon 7B instruct LLM as my base LLM and want to fine tune this with open assistant instruct dataset. I have 2080 Ti with 11G VRAM. So I am using 4 bit quantization and Lora.

These are the experiments I did so far:

1> I trained with SFT trainer from hugging face for 25000 epochs, the loss decreased from 1.8 to 0.7. Below is the entire code I am using for training.

import torch, einops
from datasets import load_dataset
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    AutoTokenizer,
    TrainingArguments
)
from peft.tuners.lora import LoraLayer

from trl import SFTTrainer


def create_and_prepare_model():
    compute_dtype = getattr(torch, "float16")

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=compute_dtype,
        bnb_4bit_use_double_quant=True,
    )

    model = AutoModelForCausalLM.from_pretrained(
        "tiiuae/falcon-7b-instruct", quantization_config=bnb_config, device_map={"": 0}, trust_remote_code=True
    )

    peft_config = LoraConfig(
        lora_alpha=16,
        lora_dropout=0.1,
        r=64,
        bias="none",
        task_type="CAUSAL_LM",
        target_modules=[
            "query_key_value"
        ],
    )

    tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b-instruct", trust_remote_code=True)
    tokenizer.pad_token = tokenizer.eos_token

    return model, peft_config, tokenizer


training_arguments = TrainingArguments(
    output_dir="./results_falcon-7b-instruct-new",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=10,
    optim="paged_adamw_32bit",
    save_steps=5,
    logging_steps=10,
    learning_rate=2e-4,
    fp16=True,
    max_grad_norm=0.3,
    max_steps=20,
    warmup_ratio=0.03,
    # group_by_length=True,
    lr_scheduler_type="constant",
)

model, peft_config, tokenizer = create_and_prepare_model()
model.config.use_cache = False
dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=512,
    tokenizer=tokenizer,
    args=training_arguments,
    packing=True,
)

trainer.train()
trainer.save_model("falcon-instruct-7b-4bit-openassist-latest-new")
model.config.to_json_file("falcon-instruct-7b-4bit-openassist-latest-new/config.json")

took about 53 hours. But the model just spits out gibberish when asked for a simple question like "how are you?"

2> 300 epochs, loss went down from 1.8 to 1.5 but the model still spits out gibberish.

3> 40 epochs, loss went down from 1.8 to 1.7 but the model still spits out gibberish.

Any pointers that could give me a head start? Please suggest. Any open source code to do something similar will be greatly appreciated. Thanks a lot.

Finetuning Open LLMs

Answers (1)

Related Questions