Reputation: 428
I have followed the basic example as given below, from: https://huggingface.co/transformers/training.html
from transformers import TFBertForSequenceClassification, TFTrainer, TFTrainingArguments

model = TFBertForSequenceClassification.from_pretrained("bert-large-uncased")

training_args = TFTrainingArguments(
    output_dir='./results',          # output directory
    num_train_epochs=3,              # total # of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_steps=500,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
)

trainer = TFTrainer(
    model=model,                       # the instantiated 🤗 Transformers model to be trained
    args=training_args,                # training arguments, defined above
    train_dataset=tfds_train_dataset,  # tensorflow_datasets training dataset
    eval_dataset=tfds_test_dataset,    # tensorflow_datasets evaluation dataset
)

trainer.train()
But there seems to be no way to specify the loss function for the classifier. For example, if I fine-tune on a binary classification problem, I would use
tf.keras.losses.BinaryCrossentropy(from_logits=True)
otherwise I would use
tf.keras.losses.CategoricalCrossentropy(from_logits=True)
My setup is as follows:
transformers==4.3.2
tensorflow==2.3.1
python==3.6.12
Upvotes: 6
Views: 6357
Reputation: 1523
Trainer supports customizing the loss by overriding its compute_loss method. For more details, you can look at the documentation:
https://huggingface.co/docs/transformers/main_classes/trainer#:~:text=passed%20at%20init.-,compute_loss,-%2D%20Computes%20the%20loss
Here is an example of how to customize Trainer to use a weighted loss (useful when you have an unbalanced training set):
import torch
from torch import nn
from transformers import Trainer

class CustomTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.get("labels")
        # forward pass
        outputs = model(**inputs)
        logits = outputs.get("logits")
        # compute custom loss (suppose one has 3 labels with different weights);
        # keep the weight tensor on the same device as the logits
        loss_fct = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0, 3.0], device=logits.device))
        loss = loss_fct(logits.view(-1, self.model.config.num_labels), labels.view(-1))
        return (loss, outputs) if return_outputs else loss
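CustomTrainer is then used as a drop-in replacement for Trainer. A minimal sketch, assuming model, training_args, and the datasets are already defined as they would be for a plain Trainer:

trainer = CustomTrainer(
    model=model,                  # a PyTorch 🤗 Transformers model
    args=training_args,           # a TrainingArguments instance
    train_dataset=train_dataset,  # assumed to be defined elsewhere
    eval_dataset=eval_dataset,
)
trainer.train()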
Upvotes: 5
Reputation: 307
Create a class which inherits from PreTrainedModel and compute your loss of choice in its forward function, as sketched below.
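A minimal sketch of that approach, assuming a PyTorch BERT classifier; the class name, classification head, and loss weights here are illustrative choices, not part of the original answer:

import torch
from torch import nn
from transformers import BertModel, BertPreTrainedModel

class BertWithCustomLoss(BertPreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        self.bert = BertModel(config)
        self.classifier = nn.Linear(config.hidden_size, config.num_labels)
        self.init_weights()

    def forward(self, input_ids=None, attention_mask=None, token_type_ids=None, labels=None):
        outputs = self.bert(input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids)
        # classify from the [CLS] token representation
        logits = self.classifier(outputs.last_hidden_state[:, 0, :])
        loss = None
        if labels is not None:
            # swap in whatever loss you need, e.g. a weighted cross-entropy
            loss_fct = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0], device=logits.device))
            loss = loss_fct(logits.view(-1, self.config.num_labels), labels.view(-1))
        # Trainer expects the loss as the first element when labels are given
        return (loss, logits) if loss is not None else (logits,)

model = BertWithCustomLoss.from_pretrained("bert-base-uncased", num_labels=2)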
Upvotes: 3