Hafsa

Reputation: 26

Issue with adding evaluation dataset with T5Model Training

I am training a T5 model from simpletransformers and I am getting an error on the following line - model.train_model(train, eval_data = eval_data)
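
For reference, the setup is roughly the following (a simplified sketch; the model name and args below are placeholders rather than my exact configuration):

import pandas as pd
from simpletransformers.t5 import T5Model, T5Args

# Both frames have the three columns the T5 dataset code expects.
train = pd.DataFrame(
    [["summarize", "a long input text", "a short target"]],
    columns=["prefix", "input_text", "target_text"],
)
eval_data = pd.DataFrame(
    [["summarize", "another long input text", "another target"]],
    columns=["prefix", "input_text", "target_text"],
)

model_args = T5Args(evaluate_during_training=True)  # placeholder args
model = T5Model("t5", "t5-base", args=model_args)   # placeholder model name

model.train_model(train, eval_data=eval_data)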

The ERROR is as follows -

/usr/local/lib/python3.10/dist-packages/torch/optim/lr_scheduler.py:139: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  warnings.warn("Detected call of `lr_scheduler.step()` before `optimizer.step()`. "
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-89940dd49e1d> in <cell line: 1>()
----> 1 model.train_model(train, eval_data = eval_data)

4 frames
/usr/local/lib/python3.10/dist-packages/simpletransformers/t5/t5_model.py in train_model(self, train_data, output_dir, show_running_loss, args, eval_data, verbose, **kwargs)
    227         os.makedirs(output_dir, exist_ok=True)
    228 
--> 229         global_step, training_details = self.train(
    230             train_dataset,
    231             output_dir,

/usr/local/lib/python3.10/dist-packages/simpletransformers/t5/t5_model.py in train(self, train_dataset, output_dir, show_running_loss, eval_data, verbose, **kwargs)
    756 
    757             if args.evaluate_during_training and args.evaluate_each_epoch:
--> 758                 results = self.eval_model(
    759                     eval_data,
    760                     verbose=verbose and args.evaluate_during_training_verbose,

/usr/local/lib/python3.10/dist-packages/simpletransformers/t5/t5_model.py in eval_model(self, eval_data, output_dir, verbose, silent, **kwargs)
    912         self._move_model_to_device()
    913 
--> 914         eval_dataset = self.load_and_cache_examples(
    915             eval_data, evaluate=True, verbose=verbose, silent=silent
    916         )

/usr/local/lib/python3.10/dist-packages/simpletransformers/t5/t5_model.py in load_and_cache_examples(self, data, evaluate, no_cache, verbose, silent)
   1181             return CustomDataset(tokenizer, args, data, mode)
   1182         else:
-> 1183             return T5Dataset(
   1184                 tokenizer,
   1185                 self.args,

/usr/local/lib/python3.10/dist-packages/simpletransformers/t5/t5_utils.py in __init__(self, tokenizer, args, data, mode)
    161                 (prefix, input_text, target_text, tokenizer, args)
    162                 for prefix, input_text, target_text in zip(
--> 163                     data["prefix"], data["input_text"], data["target_text"]
    164                 )
    165             ]

TypeError: list indices must be integers or slices, not str

Please note that both the train and evaluation datasets have prefix, input_text, and target_text columns. I have no idea where TypeError: list indices must be integers or slices, not str is coming from.
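
In case it helps pin down where that message can come from: data["prefix"] only fails this way when data is a plain Python list at that point, e.g. a list of records instead of a DataFrame (just an illustration, not necessarily what is happening in my case):

records = [
    {"prefix": "summarize", "input_text": "long text", "target_text": "short text"},
]
records["prefix"]                # TypeError: list indices must be integers or slices, not str

import pandas as pd
pd.DataFrame(records)["prefix"]  # a DataFrame indexed by column name works fine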

I am using Google Colab to train. I tried opening the library's internal t5_model.py, adding print statements, and making small changes to get a sense of what the issue is, but I don't think my changes are being picked up, because it keeps throwing the error at line 929. The same code was working a few days ago.
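
Presumably edits to an already-imported module under dist-packages only take effect after the runtime is restarted or the module is explicitly re-imported, e.g. something like:

import importlib
import simpletransformers.t5.t5_model as t5_model

# Re-import the edited source so added print statements actually run;
# objects created before the reload still use the old code.
importlib.reload(t5_model)

Even so, restarting the Colab runtime and re-running the imports seems to be the more reliable way to see such edits.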

Upvotes: 1

Views: 64

Answers (0)
