Iamspeed Mc


MPS only uses one GPU for training

Only 1 of the 10 GPU cores on my M3 Mac appears to be used during training. I have no idea how to use all of them, or at least 8 of them. I don't think MPS allows more than one GPU core to be used for training. Can someone help me out?
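
As far as I understand, PyTorch's MPS backend exposes the whole Apple GPU as a single "mps" device rather than one device per core, so from Python there is not much more I can check than this (assuming a recent PyTorch build with MPS support):

import torch

# Was this PyTorch build compiled with MPS support?
print(torch.backends.mps.is_built())

# Is the MPS device actually usable at runtime?
print(torch.backends.mps.is_available())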

Here is my code:

import torch
from transformers import DistilBertForSequenceClassification, DistilBertTokenizer, Trainer, TrainingArguments
from datasets import Dataset, DatasetDict
import pandas as pd
from sklearn.model_selection import train_test_split

# Load the dataset
df = pd.read_csv('jigsaw-toxic-comment-train-processed-seqlen128.csv')

# Drop unnecessary columns
df = df[['comment_text', 'toxic']]

# Split the data into training and validation sets
train_df, val_df = train_test_split(df, test_size=0.2, random_state=42)

# Convert dataframes to Hugging Face Datasets
train_dataset = Dataset.from_pandas(train_df)
val_dataset = Dataset.from_pandas(val_df)
dataset = DatasetDict({'train': train_dataset, 'validation': val_dataset})

# Load tokenizer
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')

def preprocess_function(examples):
    # Pads every example to the tokenizer's model max length (512 for DistilBERT)
    return tokenizer(examples['comment_text'], padding='max_length', truncation=True)

# Tokenize the datasets
tokenized_datasets = dataset.map(preprocess_function, batched=True)

# Rename the toxic column to labels for compatibility
tokenized_datasets = tokenized_datasets.rename_column("toxic", "labels")

# Load model
model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', num_labels=2)

# Move model to the Apple GPU (fall back to CPU if MPS is unavailable)
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
model.to(device)

# Set up training arguments
training_args = TrainingArguments(
    output_dir='./results',
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
)

# Initialize Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['validation'],
    tokenizer=tokenizer,
)

# Train the model
trainer.train()

I have no clue how to solve this problem. Would switching to MLX help?
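
For reference, this is the kind of minimal MLX experiment I had in mind (assuming the mlx package from pip install mlx; I have not verified whether it actually uses more of the GPU than PyTorch does):

import mlx.core as mx

# MLX arrays live in unified memory; ops run on the Apple GPU by default
a = mx.random.normal((4096, 4096))
b = mx.random.normal((4096, 4096))

# MLX is lazy: the matrix multiply only runs once the result is evaluated
c = a @ b
mx.eval(c)
print(c.shape)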
