Enzo

Reputation: 1

ORPOTrainer Error: Calculated loss must be on the original device: cuda:0 but device in use is cuda:3

I am trying to train Phi-3 on an ORPO dataset using the ORPOTrainer from the Hugging Face TRL library. My machine has 4 GPUs, so I would like to run multi-GPU training. This is my ORPOConfig:

import os

from trl import ORPOConfig

orpo_args = ORPOConfig(
    learning_rate=3e-5,
    beta=0.1,
    lr_scheduler_type="linear",
    max_length=2048,
    max_prompt_length=2048,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,
    optim="paged_adamw_8bit",
    num_train_epochs=3,
    evaluation_strategy="steps",
    eval_steps=200,
    bf16=True,
    logging_steps=1,
    save_steps=500,
    warmup_steps=100,
    report_to="wandb",
    output_dir="./results/",
    remove_unused_columns=False,
    dataset_num_proc=os.cpu_count(),  # use all CPU cores for dataset preprocessing
)

and this is the trainer:

from trl import ORPOTrainer

trainer = ORPOTrainer(
    model=model,
    args=orpo_args,
    train_dataset=formatted_orpo_dataset["train"],
    eval_dataset=formatted_orpo_dataset["test"],
    peft_config=peft_config,
    tokenizer=tokenizer,
)

The model was loaded with device_map set to "auto", but I get this error as soon as the trainer starts: "Calculated loss must be on the original device: cuda:0 but device in use is cuda:3"

Has anyone else encountered this issue and resolved it?

Thank you.

I tried to start the ORPOTrainer, but I get this error: "Calculated loss must be on the original device: cuda:0 but device in use is cuda:3".

Upvotes: 0

Views: 387

Answers (1)

David Peer

Reputation: 130

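I had the same problem and found a solution here. Instead of device_map="auto", which shards the model's layers across all visible GPUs, I now use the accelerator to place the whole model on the GPU that belongs to the current process: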

from accelerate import Accelerator
from transformers import AutoModelForCausalLM

accelerator = Accelerator()
[...]

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    # "" maps the entire model to this process's GPU instead of sharding it
    device_map={"": accelerator.process_index},
    attn_implementation=attn_implementation,
)
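
To make clearer what that device_map does, here is a minimal sketch that builds the same kind of mapping without an Accelerator object. It assumes the script is started by a launcher such as accelerate launch or torchrun, which set the LOCAL_RANK environment variable for each worker process. The helper name per_process_device_map is hypothetical and not part of any library.

```python
import os


def per_process_device_map(env=None):
    """Build a device_map that puts the whole model on this process's GPU.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    # Assumption: the launcher sets LOCAL_RANK per worker.
    # Fall back to GPU 0 when the script runs as a single plain process.
    local_rank = int(env.get("LOCAL_RANK", 0))
    # The empty-string key refers to the entire model, so nothing is
    # sharded across GPUs; each process holds its own full copy.
    return {"": local_rank}


if __name__ == "__main__":
    print(per_process_device_map())
```

The returned dictionary can be passed as device_map to from_pretrained, with the script started as one process per GPU, for example accelerate launch --num_processes 4 train.py (train.py is a placeholder name). On a single machine this gives the same index as accelerator.process_index; across several machines, accelerator.local_process_index is the per-machine equivalent.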

Upvotes: 0
