raceee

Reputation: 517

Memory Issue while following LM tutorial

SPECS:
OS: Windows 10
CUDA: 10.1
GPU: RTX 2060 6GB VRAM (x2)
RAM: 32GB
Tutorial: https://huggingface.co/blog/how-to-train

Hello, I am trying to train my own language model and I have run into some memory issues. I first tried to run this code in PyCharm on my own machine, and then tried to replicate it in a Colab Pro notebook.

First, my code:

from transformers import RobertaConfig, RobertaTokenizerFast, RobertaForMaskedLM, LineByLineTextDataset
from transformers import DataCollatorForLanguageModeling, Trainer, TrainingArguments

# RoBERTa config trained from scratch: 60k-token vocab, 6 hidden layers, 12 attention heads
config = RobertaConfig(vocab_size=60000, max_position_embeddings=514, num_attention_heads=12, num_hidden_layers=6,
                       type_vocab_size=1)

tokenizer = RobertaTokenizerFast.from_pretrained("./MODEL DIRECTORY", max_len=512)

model = RobertaForMaskedLM(config=config)

print("making dataset")

# reads and tokenizes the entire text file into memory, line by line
dataset = LineByLineTextDataset(tokenizer=tokenizer, file_path="./total_text.txt", block_size=128)

print("making c")

data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

# one epoch, 64 sequences per GPU per optimization step
training_args = TrainingArguments(output_dir="./MODEL DIRECTORY", overwrite_output_dir=True, num_train_epochs=1,
                                  per_gpu_train_batch_size=64, save_steps=10000, save_total_limit=2)
print("Building trainer")
trainer = Trainer(model=model, args=training_args, data_collator=data_collator, train_dataset=dataset,
                  prediction_loss_only=True)
trainer.train()

trainer.save_model("./MODEL DIRECTORY")

"./total_text.txt" being a 1.7GB text file.

PyCharm Attempt

On PyCharm, this code builds the dataset but then throws an error saying that my preferred GPU is running out of memory and that Torch is already using 3.7GiB of it.

I tried a few workarounds but, succumbing to the fact that Torch was, for the time being, going to force itself onto my GPU, I decided to move over to Colab.
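For reference, I assume most of the GPU pressure comes from the batch size of 64 in TrainingArguments. A lower-memory configuration (the numbers below are only illustrative, not something I have verified on this model) would look roughly like this:

from transformers import TrainingArguments

# Sketch only: a smaller per-GPU batch plus gradient accumulation keeps the
# effective batch size at 64 while holding far fewer activations in VRAM.
training_args = TrainingArguments(output_dir="./MODEL DIRECTORY", overwrite_output_dir=True, num_train_epochs=1,
                                  per_gpu_train_batch_size=8,     # renamed per_device_train_batch_size in newer versions
                                  gradient_accumulation_steps=8,  # 8 * 8 = effective batch of 64
                                  save_steps=10000, save_total_limit=2)

Whether that actually fits in 6GB still depends on the sequence length and the rest of the config.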

Colab Attempt

Colab has a different issue with my code: it does not have the memory to build the dataset and crashes due to RAM shortages. I purchased a Pro account and increased the usable RAM to 25GB, but I still hit memory shortages.
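As far as I can tell, the RAM spike happens because LineByLineTextDataset reads and tokenizes the whole 1.7GB file up front. A rough sketch of a lazier alternative using the datasets library (assuming it is installed; I have not run this end to end) would be:

from datasets import load_dataset
from transformers import RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("./MODEL DIRECTORY", max_len=512)

# load_dataset("text") memory-maps the file instead of holding it all in RAM
raw = load_dataset("text", data_files={"train": "./total_text.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

# map() tokenizes in batches and writes the result to an on-disk cache
dataset = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

The resulting dataset should drop into the same Trainer and DataCollatorForLanguageModeling as before.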

Cheers!

Upvotes: 0

Views: 839

Answers (1)

raceee

Reputation: 517

I came to the conclusion that my text file for training was way too big. In the other examples I found, the training text was around 300MB, not 1.7GB. In both instances I was asking PyCharm and Colab to pull off a very resource-expensive task.
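For anyone who hits the same wall, the crude workaround is to carve a smaller file out of the big one before building the dataset. A sketch sized to the ~300MB figure above (file names are just examples):

# Copy roughly the first 300MB of the corpus into a smaller training file
max_bytes = 300 * 1024 * 1024
written = 0
with open("total_text.txt", encoding="utf-8") as src, open("total_text_small.txt", "w", encoding="utf-8") as dst:
    for line in src:
        dst.write(line)
        written += len(line.encode("utf-8"))
        if written >= max_bytes:
            break

Point LineByLineTextDataset at the smaller file and both the dataset build and training become far cheaper.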

Upvotes: 0
