Reputation: 517
SPECS:
OS: Windows 10
CUDA: 10.1
GPU: RTX 2060 6GB VRAM (x2)
RAM: 32GB
Tutorial: https://huggingface.co/blog/how-to-train
Hello, I am trying to train my own language model and have run into memory issues. I ran this code in PyCharm on my computer and then tried to replicate it in my Colab Pro notebook.
from transformers import RobertaConfig, RobertaTokenizerFast, RobertaForMaskedLM, LineByLineTextDataset
from transformers import DataCollatorForLanguageModeling, Trainer, TrainingArguments

# Small RoBERTa: 6 hidden layers, 12 attention heads, 60k-token vocabulary.
config = RobertaConfig(vocab_size=60000, max_position_embeddings=514, num_attention_heads=12,
                       num_hidden_layers=6, type_vocab_size=1)

tokenizer = RobertaTokenizerFast.from_pretrained("./MODEL DIRECTORY", max_len=512)
model = RobertaForMaskedLM(config=config)

print("making dataset")
# LineByLineTextDataset reads and tokenizes the entire file into memory up front.
dataset = LineByLineTextDataset(tokenizer=tokenizer, file_path="./total_text.txt", block_size=128)

print("making collator")
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

# per_gpu_train_batch_size is deprecated; per_device_train_batch_size replaces it,
# and prediction_loss_only now lives in TrainingArguments rather than Trainer.
training_args = TrainingArguments(output_dir="./MODEL DIRECTORY", overwrite_output_dir=True,
                                  num_train_epochs=1, per_device_train_batch_size=64,
                                  save_steps=10000, save_total_limit=2, prediction_loss_only=True)

print("Building trainer")
trainer = Trainer(model=model, args=training_args, data_collator=data_collator, train_dataset=dataset)

trainer.train()
trainer.save_model("./MODEL DIRECTORY")
"./total_text.txt"
being a 1.7GB text file.
In PyCharm, this code builds the dataset and then throws an error saying that my preferred GPU was running out of memory, and that Torch was already using 3.7GiB of memory.
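A batch of 64 sequences is a lot for a 6GB card. The usual first fix is to shrink the batch that actually hits the GPU. A minimal sketch (the specific numbers here are assumptions, not tuned values): cut per_device_train_batch_size and recover the effective batch size of 64 through gradient_accumulation_steps, optionally enabling fp16 to roughly halve activation memory:

training_args = TrainingArguments(
    output_dir="./MODEL DIRECTORY",
    overwrite_output_dir=True,
    num_train_epochs=1,
    per_device_train_batch_size=8,   # 8 sequences per step instead of 64
    gradient_accumulation_steps=8,   # 8 x 8 = effective batch size of 64
    fp16=True,                       # mixed precision; supported on RTX cards
    save_steps=10000,
    save_total_limit=2,
)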
I tried:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = ""
so that Torch would have to use my CPU and not my GPU. It still threw the same GPU memory error. So, accepting that Torch was, for the time being, forcing itself onto my GPU, I decided to move to Colab.
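A likely reason the environment variable had no effect: CUDA reads CUDA_VISIBLE_DEVICES once, when it is first initialized, so setting it after torch has already touched the GPU is silently ignored. A minimal sketch of the two ways to keep the Trainer on the CPU, assuming a transformers version that still has the no_cuda flag:

import os
# Option 1: hide the GPUs *before* torch initializes CUDA, i.e. set this at the
# very top of the script, before importing torch or transformers.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# Option 2: ask the Trainer for CPU explicitly via TrainingArguments.
from transformers import TrainingArguments
training_args = TrainingArguments(output_dir="./MODEL DIRECTORY", no_cuda=True)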
Colab has different issues with my code: it does not have the memory to build the dataset and crashes due to RAM shortages. I purchased a Pro account and increased the usable RAM to 25GB; still memory shortages.
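The RAM crash points at LineByLineTextDataset, which tokenizes the entire 1.7GB file into Python objects before training starts. A minimal sketch of a lower-memory alternative, assuming the separate datasets library is installed: its "text" loader memory-maps the corpus through Apache Arrow, and map() caches tokenized batches to disk instead of holding them all in RAM.

from datasets import load_dataset
from transformers import RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("./MODEL DIRECTORY", max_len=512)

# The file is memory-mapped, not read into RAM wholesale.
raw = load_dataset("text", data_files={"train": "./total_text.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

# Batched map tokenizes chunk by chunk and caches the result on disk;
# the resulting dataset can be passed to Trainer as train_dataset.
dataset = raw["train"].map(tokenize, batched=True, remove_columns=["text"])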
Cheers!
Upvotes: 0
Views: 839
Reputation: 517
I came to the conclusion that my text file for training was way too big. In the other examples I found, the training text was around 300MB, not 1.7GB. In both cases I was asking PyCharm and Colab to pull off a very resource-expensive task.
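If you want to test the pipeline before committing to the full corpus, one low-tech workaround is to carve off a slice around that 300MB mark and train on it first. A minimal sketch; "subset_text.txt" is a hypothetical output name:

# Copy roughly the first 300MB of the corpus into a smaller training file.
limit = 300 * 1024 * 1024  # ~300MB, matching the size of the working examples
written = 0
with open("./total_text.txt", "r", encoding="utf-8") as src, \
     open("./subset_text.txt", "w", encoding="utf-8") as dst:
    for line in src:
        dst.write(line)
        written += len(line.encode("utf-8"))
        if written >= limit:
            break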
Upvotes: 0