Naqi

Reputation: 135

RuntimeError: Input, output and indices must be on the current device. (fill_mask("Random text <mask>."))

I am getting "RuntimeError: Input, output and indices must be on the current device." when I run this line: fill_mask("Auto Car <mask>.")

I am running it on Colab. My Code:

from transformers import BertTokenizer, BertForMaskedLM
from pathlib import Path
from tokenizers import ByteLevelBPETokenizer


paths = [str(x) for x in Path(".").glob("**/*.txt")]
print(paths)

bert_tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

from transformers import BertModel, BertConfig

configuration = BertConfig()
model = BertModel(configuration)
configuration = model.config
print(configuration)

model = BertForMaskedLM.from_pretrained("bert-base-uncased")

from transformers import LineByLineTextDataset
dataset = LineByLineTextDataset(
    tokenizer=bert_tokenizer,
    file_path="./kant.txt",
    block_size=128,
)

from transformers import DataCollatorForLanguageModeling
data_collator = DataCollatorForLanguageModeling(
    tokenizer=bert_tokenizer, mlm=True, mlm_probability=0.15
)

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./KantaiBERT",
    overwrite_output_dir=True,
    num_train_epochs=1,
    per_device_train_batch_size=64,
    save_steps=10_000,
    save_total_limit=2,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=dataset,
)

trainer.train()

from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model=model,
    tokenizer=bert_tokenizer
)

fill_mask("Auto Car <mask>.")

The last line is giving me the error mentioned above. Please let me know what I am doing wrong or what I have to do in order to remove this error.

Upvotes: 0

Views: 2340

Answers (1)

cronoik

Reputation: 19385

The Trainer automatically trains your model on the GPU when one is available (the default is no_cuda=False). You can verify this by running:

model.device

after training. The pipeline does not do this: it keeps its inputs on the CPU by default, which leads to the error you see (your model is on the GPU but your example sentence is on the CPU). You can fix that either by running the pipeline with GPU support as well:

fill_mask = pipeline(
    "fill-mask",
    model=model,
    tokenizer=bert_tokenizer,
    device=0,
)

or by transferring your model to CPU before initializing the pipeline:

model.to('cpu')
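
If you want the same notebook to work with or without a GPU, you can derive the pipeline's device argument from wherever the model currently lives instead of hard-coding 0. This is only a minimal sketch: pipeline_device_index is a hypothetical helper name, and it assumes the integer convention for the device argument where -1 means CPU and a non-negative number is the CUDA device index.

```python
def pipeline_device_index(device: str) -> int:
    """Map a torch-style device string to an integer for pipeline(device=...).

    Assumption: -1 selects the CPU and n >= 0 selects CUDA device n.
    """
    kind, _, ordinal = device.partition(":")
    if kind == "cpu":
        return -1
    if kind == "cuda":
        # A bare "cuda" string carries no index; treat it as device 0.
        return int(ordinal) if ordinal else 0
    raise ValueError(f"unsupported device: {device!r}")


# Hypothetical usage with the objects from the question (not executed here):
# fill_mask = pipeline(
#     "fill-mask",
#     model=model,
#     tokenizer=bert_tokenizer,
#     device=pipeline_device_index(str(model.device)),
# )
```

Once the device mismatch is gone, also check bert_tokenizer.mask_token: for bert-base-uncased it is [MASK], not <mask>, so the prompt will probably need to use that token.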

Upvotes: 1
