Reputation: 4820
I’m trying to tune hyper-params with the following code:
def my_hp_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 5e-3, 5e-5),
        "arr_gradient_accumulation_steps": trial.suggest_int("num_train_epochs", 8, 16),
        "arr_per_device_train_batch_size": trial.suggest_int(2, 4),
    }

def get_model(model_name, config):
    return AutoModelForSequenceClassification.from_pretrained(model_name, config=config)

def compute_metric(eval_predictions):
    metric = load_metric('accuracy')
    logits, labels = eval_predictions
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

training_args = TrainingArguments(output_dir='test-trainer',
                                  evaluation_strategy="epoch",
                                  num_train_epochs=10)
data_collator = default_data_collator
model_name = 'sentence-transformers/nli-roberta-base-v2'
config = AutoConfig.from_pretrained(model_name, num_labels=3)

trainer = Trainer(
    model_init = get_model(model_name, config),
    args = training_args,
    train_dataset = tokenized_datasets['TRAIN'],
    eval_dataset = tokenized_datasets['TEST'],
    compute_metrics = compute_metric,
    tokenizer = None,
    data_collator = data_collator,
)
best = trainer.hyperparameter_search(direction="maximize", hp_space=my_hp_space)
And I get this error:
1173 model = self.model_init(trial)
1174 else:
-> 1175 raise RuntimeError("model_init should have 0 or 1 argument.")
1177 if model is None:
1178 raise RuntimeError("model_init should not return None.")
RuntimeError: model_init should have 0 or 1 argument.
Upvotes: 1
Views: 882
Reputation: 436
According to the documentation, you have to pass model_init as a callable, that is, the function itself rather than the result of calling it.
trainer = Trainer(
    model_init = get_model,
    args = training_args,
    train_dataset = tokenized_datasets['TRAIN'],
    eval_dataset = tokenized_datasets['TEST'],
    compute_metrics = compute_metric,
    tokenizer = None,
    data_collator = data_collator,
)
Additionally, there is an issue with the number of parameters of the function you pass as model_init. Your function get_model requires two parameters, while model_init may take only 0 or 1. The Hugging Face documentation states:
The function may have zero argument, or a single one containing the optuna/Ray Tune/SigOpt trial object, to be able to choose different architectures according to hyper parameters (such as layer count, sizes of inner layers, dropout probabilities etc).
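To see why the original get_model is rejected, you can count its parameters yourself. This is only an illustrative sketch: count_parameters is a hypothetical helper, not part of transformers, and the get_model body is a stand-in so the snippet runs without loading a model.

```python
import inspect


def get_model(model_name, config):
    # Stand-in with the same signature as the function in the question;
    # the real body (loading the model) is omitted here.
    ...


def count_parameters(func):
    # Hypothetical helper: number of parameters declared in func's signature.
    return len(inspect.signature(func).parameters)


# Two parameters, which is outside the "zero or one" rule quoted above.
print(count_parameters(get_model))
```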
You can define your parameters inside the get_model function, so that it takes no arguments, and it works.
def get_model():
    model_name = 'sentence-transformers/nli-roberta-base-v2'
    config = AutoConfig.from_pretrained(model_name, num_labels=3)
    return AutoModelForSequenceClassification.from_pretrained(model_name, config=config)
The official Ray Tune example shows how to keep your parametrisation. It defines an outer function, tune_transformer, and defines get_model inside the scope of that function, so get_model can read the outer function's variables without taking them as arguments. You can check that example if you want to keep your parametrisation.
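A minimal sketch of the same closure idea, in case it helps. This is not the Ray Tune example itself: make_model_init is a hypothetical name I made up, and the import sits inside the inner function only so that building the factory does not require transformers to be loaded.

```python
def make_model_init(model_name, num_labels):
    # Hypothetical factory: returns a zero-argument function that closes over
    # model_name and num_labels, which is the shape model_init expects.
    def model_init():
        # Imported lazily so the factory can be created without loading
        # transformers; a module-level import works just as well.
        from transformers import AutoConfig, AutoModelForSequenceClassification

        config = AutoConfig.from_pretrained(model_name, num_labels=num_labels)
        return AutoModelForSequenceClassification.from_pretrained(
            model_name, config=config
        )

    return model_init


# Pass the returned function itself; the Trainer calls it for every trial.
model_init = make_model_init('sentence-transformers/nli-roberta-base-v2', 3)
```

You would then pass model_init = model_init to Trainer, keeping the rest of the arguments as before.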
Hope it helps.
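One more thing, unrelated to the error above: the search space in the question will probably fail next. suggest_float is given a lower bound larger than the upper bound, the second suggest_int call has no parameter name, and the dictionary keys do not match TrainingArguments fields (as far as I know, keys without a matching field are not applied). A possible corrected version is sketched below, assuming the Optuna backend; log=True is my own choice, not something from the question.

```python
def my_hp_space(trial):
    # Keys are TrainingArguments field names so the sampled values get applied.
    return {
        # Bounds are (low, high); log=True is an assumption that samples the
        # learning rate on a log scale.
        "learning_rate": trial.suggest_float("learning_rate", 5e-5, 5e-3, log=True),
        "gradient_accumulation_steps": trial.suggest_int(
            "gradient_accumulation_steps", 8, 16
        ),
        "per_device_train_batch_size": trial.suggest_int(
            "per_device_train_batch_size", 2, 4
        ),
    }
```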
Upvotes: 0