Julian Gerhard

Reputation: 116

Further fine-tune a PEFT/LoRA fine-tuned CausalLM model

I am a bit unsure how to proceed regarding the mentioned topic.

The starting point is a model created with Hugging Face's transformers library as an AutoModelForCausalLM, fine-tuned with PEFT using a LoRA approach, with the adapter weights subsequently merged into the base model.

I now want to fine-tune the model further without losing its original capabilities, in this case via instruction fine-tuning / prefix tuning.

My approach would be the following:

model = AutoModelForCausalLM.from_pretrained(
        model_id,
        use_cache=False if gradient_checkpointing else True,  # caching conflicts with gradient checkpointing
        device_map="auto",
        load_in_8bit=True,
    )

model = create_peft_config(model)
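
Here `create_peft_config` is a small helper, roughly along these lines (a sketch with placeholder hyperparameters; in older PEFT releases `prepare_model_for_kbit_training` is called `prepare_model_for_int8_training`):

from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

def create_peft_config(model):
    # Prepare the 8-bit model for training (casts norms, enables input grads)
    model = prepare_model_for_kbit_training(model)
    peft_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        inference_mode=False,
        r=8,              # placeholder rank
        lora_alpha=32,    # placeholder scaling
        lora_dropout=0.1,
    )
    model = get_peft_model(model, peft_config)
    model.print_trainable_parameters()
    return model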

output_dir = "/tmp"
training_args = TrainingArguments(
        output_dir=output_dir,
        overwrite_output_dir=True,
        per_device_train_batch_size=per_device_train_batch_size,
        per_device_eval_batch_size=per_device_train_batch_size,
        bf16=bf16,
        learning_rate=lr,
        num_train_epochs=epochs,
        gradient_checkpointing=gradient_checkpointing,
        gradient_accumulation_steps=2,
        logging_dir=f"{output_dir}/logs",
        logging_strategy="steps",
        logging_steps=10,
        optim="adafactor",
        save_strategy="epoch",
        save_total_limit=3,
        evaluation_strategy="epoch",
        load_best_model_at_end=False,
        no_cuda=False,
        auto_find_batch_size=True
)

trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=dataset_train,
        compute_metrics=compute_metrics,
        preprocess_logits_for_metrics=preprocess_logits_for_metrics,
        eval_dataset=dataset_eval,
        data_collator=default_data_collator
)

trainer.train()

trainer.model.save_pretrained(output_dir)

del model
del trainer

# Reload the base model in fp16 and re-apply the trained LoRA adapter
peft_config = PeftConfig.from_pretrained(output_dir)
model = AutoModelForCausalLM.from_pretrained(
        peft_config.base_model_name_or_path,
        load_in_8bit=False,
        return_dict=True,
        device_map="auto",
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
)
model = PeftModel.from_pretrained(
        model,
        output_dir,
        torch_dtype=torch.float16,
        device_map="auto",
)
model.eval()
os.makedirs("lora", exist_ok=True)

merged_model = model.merge_and_unload()  # merge the LoRA weights into the base model
merged_model.save_pretrained('lora')

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.save_pretrained('lora')

In principle, I am loading the original model with the already-merged weights, fine-tuning it on new data, likewise with PEFT and LoRA, and afterwards merging the weights into the base model again.

Is this a sensible approach, or is there reason to believe, for example, that it might significantly compromise the original capabilities? If something speaks against it, what would be a better approach?

Kind regards and thanks in advance

Update: after a training run of 3 epochs on about 15,000 instruction pairs, the model is created correctly, the weights are applied, and it can be loaded afterwards.

Unfortunately, it is clear that the model has lost its original capabilities: prompts that worked before no longer produce usable output. The model still tries to approach the prompts in the right way, but the quality of its responses has degraded noticeably.

Upvotes: 5

Views: 7160

Answers (1)

thistleknot

Reputation: 1158

The original documentation doesn't show how to load the model: https://huggingface.co/blog/4bit-transformers-bitsandbytes

    trainer.model.save_pretrained('./bits')
    ...
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    #model_id = '/root/.cache/huggingface/hub/models--EleutherAI--gpt-neo-1.3B/snapshots/0f35a20339a9e208bc061570981180cfb87c9572'

    peft_config = PeftConfig.from_pretrained('bits')
    model = AutoModelForCausalLM.from_pretrained(
            peft_config.base_model_name_or_path,
            quantization_config=bnb_config, device_map={"":0}
            #load_in_8bit=False,
            #return_dict=True,
            #device_map="auto",
            #torch_dtype=torch.float16,
            #low_cpu_mem_usage=True,
    )
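
Note that `bnb_config` is assumed to be defined earlier in the script; following the linked blog post, a typical 4-bit configuration would look roughly like the sketch below, and the saved adapter still has to be attached on top of the quantized base model with `PeftModel`:

    from transformers import BitsAndBytesConfig
    from peft import PeftModel
    import torch

    # Sketch of a 4-bit quantization config in the style of the bitsandbytes blog post
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
        bnb_4bit_use_double_quant=True,
    )

    # Attach the saved LoRA adapter to the quantized base model
    model = PeftModel.from_pretrained(model, 'bits')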

However, there are some additional steps described here as well, if you are using PeftModel: https://github.com/huggingface/blog/blob/main/peft.md

      peft_config = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)
    
      model = AutoModelForSeq2SeqLM.from_pretrained(model_name_or_path)
    + model = get_peft_model(model, peft_config)
    + model.print_trainable_parameters()
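
The snippet above targets a seq2seq model; for the causal LM in the question, the task type would presumably be swapped, along these lines:

    # Sketch: same pattern for a causal LM (task_type swapped)
    peft_config = LoraConfig(task_type=TaskType.CAUSAL_LM, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)
    model = get_peft_model(model, peft_config)
    model.print_trainable_parameters()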

Upvotes: 1
