user3668129

Reputation: 4820

Hugging Face: RuntimeError: model_init should have 0 or 1 argument

I’m trying to tune hyperparameters with the following code:

def my_hp_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 5e-3, 5e-5),
        "arr_gradient_accumulation_steps": trial.suggest_int("num_train_epochs", 8, 16),
        "arr_per_device_train_batch_size": trial.suggest_int(2, 4),        
    }


def get_model(model_name, config):
    return AutoModelForSequenceClassification.from_pretrained(model_name, config=config)

def compute_metric(eval_predictions):
    
    metric         = load_metric('accuracy')    
    logits, labels = eval_predictions
    predictions    = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

training_args   = TrainingArguments(output_dir='test-trainer', 
                                    evaluation_strategy="epoch",
                                    num_train_epochs= 10)
data_collator   = default_data_collator
model_name      = 'sentence-transformers/nli-roberta-base-v2'
config = AutoConfig.from_pretrained(model_name,num_labels=3)

trainer = Trainer(
    model_init      = get_model(model_name, config),
    args            = training_args,
    train_dataset   = tokenized_datasets['TRAIN'],
    eval_dataset    = tokenized_datasets['TEST'],    
    compute_metrics = compute_metric,
    tokenizer       = None,
    data_collator   = data_collator,
)

best = trainer.hyperparameter_search(direction="maximize", hp_space=my_hp_space)

I get the following error:

 1173     model = self.model_init(trial)
   1174 else:
-> 1175     raise RuntimeError("model_init should have 0 or 1 argument.")
   1177 if model is None:
   1178     raise RuntimeError("model_init should not return None.")

RuntimeError: model_init should have 0 or 1 argument.
  1. What am I doing wrong?
  2. How can I fix it, run the hyperparameter search, and get the best model parameters?

Upvotes: 1

Views: 882

Answers (1)

rbi

Reputation: 436

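According to the documentation, model_init must be a callable that returns a model. Writing get_model(model_name, config) calls the function immediately and hands its return value (a model instance) to the Trainer, rather than a function the Trainer can call to build a fresh model for each trial. Pass the function itself, without parentheses: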

trainer = Trainer(
    model_init      = get_model,
    args            = training_args,
    train_dataset   = tokenized_datasets['TRAIN'],
    eval_dataset    = tokenized_datasets['TEST'],    
    compute_metrics = compute_metric,
    tokenizer       = None,
    data_collator   = data_collator,
)

Additionally, there is an issue with the number of parameters of the function you pass as model_init. Your get_model function requires two parameters, but the Trainer only supports a model_init that takes zero arguments or one. The Hugging Face documentation states:

The function may have zero argument, or a single one containing the optuna/Ray Tune/SigOpt trial object, to be able to choose different architectures according to hyper parameters (such as layer count, sizes of inner layers, dropout probabilities etc).
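A minimal sketch of the one-argument form, using stand-ins so it runs without transformers or optuna installed. FakeTrial and build_model are hypothetical names for illustration only; in a real run the search backend supplies the trial object, and build_model would load the model with from_pretrained. The None default is an assumption that model_init may also be called without a trial (for example before the search starts), in which case a fixed value is used.

```python
class FakeTrial:
    """Hypothetical stand-in for an Optuna-style trial object."""

    def suggest_float(self, name, low, high):
        # A real trial samples from [low, high]; the midpoint keeps this deterministic.
        return (low + high) / 2


def build_model(dropout):
    """Hypothetical stand-in for loading a model with the chosen dropout."""
    return {"hidden_dropout_prob": dropout}


def model_init(trial=None):
    # Exactly one parameter: the trial object supplied by the search backend.
    if trial is None:
        dropout = 0.1  # fallback when no trial is available
    else:
        dropout = trial.suggest_float("hidden_dropout_prob", 0.0, 0.3)
    return build_model(dropout)


default_model = model_init()
tuned_model = model_init(FakeTrial())
```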

If you define the parameters inside get_model, so that it takes no arguments, it works:

def get_model():
    model_name = 'sentence-transformers/nli-roberta-base-v2'
    config = AutoConfig.from_pretrained(model_name,num_labels=3)
    return AutoModelForSequenceClassification.from_pretrained(model_name, config=config)

The official Ray Tune example shows how to keep your parametrisation. It defines an outer function, tune_transformer, and defines get_model inside the scope of tune_transformer, so get_model can use the outer function's variables without taking them as arguments. Check that example if you want to keep your parametrisation.
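The closure pattern described above can be sketched as follows. This is not the Ray Tune example itself: make_model_init and load_model are hypothetical names, and load_model stands in for the AutoConfig.from_pretrained and AutoModelForSequenceClassification.from_pretrained calls so the snippet runs without downloading anything.

```python
import inspect


def load_model(model_name, num_labels):
    """Hypothetical stand-in for the real from_pretrained calls."""
    return {"model_name": model_name, "num_labels": num_labels}


def make_model_init(model_name, num_labels):
    # The inner function takes no arguments; it reads model_name and
    # num_labels from the enclosing scope each time it is called.
    def model_init():
        return load_model(model_name, num_labels)

    return model_init


model_init = make_model_init("sentence-transformers/nli-roberta-base-v2", 3)

# Number of parameters the resulting callable declares.
n_params = len(inspect.signature(model_init).parameters)
model = model_init()
```

With the real loader in place, you would pass model_init = make_model_init(model_name, 3) to the Trainer. The call to make_model_init is intentional here, because it returns a function rather than a model.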

Hope it helps.

Upvotes: 0
