Reputation: 183
I want to predict the sentiment of thousands of sentences using a Hugging Face pipeline:
from transformers import pipeline
model_path = "cardiffnlp/twitter-xlm-roberta-base-sentiment"
pipe = pipeline("sentiment-analysis", model=model_path, tokenizer=model_path)
from datasets import load_dataset
data_files = {
    "train": "/content/data_customer.csv"
}
dataset = load_dataset("csv", data_files=data_files)
# pipe() returns a one-element list for a single string, so take its only dict
dataset = dataset.map(lambda example: pipe(example['text'])[0])
But I am getting the following error:
RuntimeError: The expanded size of the tensor (585) must match the existing size (514) at non-singleton dimension 1. Target sizes: [1, 585]. Tensor sizes: [1, 514]
The post "The size of tensor a (707) must match the size of tensor b (512) at non-singleton dimension 1" suggests a way to fix the issue, but it doesn't say how to apply the fix in a pipeline.
Upvotes: 4
Views: 15294
Reputation: 1392
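The error means that at least one input tokenizes to more tokens than the model accepts (585 in your traceback). Pass the tokenizer's truncation arguments when you initialize the pipeline, so that long inputs are cut to the maximum length: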
pipe = pipeline("sentiment-analysis", model=model_path, tokenizer=model_path, max_length=512, truncation=True)
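The same truncation arguments can also be passed each time the pipeline is called, which is convenient inside `dataset.map`. Below is a minimal sketch, assuming the dataset has a `text` column as in the question. The helper name `add_sentiment` and the output column names `label` and `score` are illustrative choices, not part of any library.

```python
from typing import Any, Callable, Dict, List


def add_sentiment(batch: Dict[str, List[str]], classify: Callable[..., List[Dict[str, Any]]]) -> Dict[str, List[Any]]:
    """Classify a batch of texts and return the results as new dataset columns."""
    # Tokenizer arguments given at call time are forwarded to the tokenizer,
    # so inputs longer than 512 tokens are truncated instead of raising an error.
    results = classify(batch["text"], truncation=True, max_length=512)
    # The pipeline returns one {"label": ..., "score": ...} dict per input text.
    return {
        "label": [r["label"] for r in results],
        "score": [r["score"] for r in results],
    }


# Usage with the `pipe` and `dataset` objects from the question (not run here):
# dataset = dataset.map(add_sentiment, batched=True, batch_size=32,
#                       fn_kwargs={"classify": pipe})
```

With `batched=True`, the pipeline receives a list of sentences per call instead of one sentence at a time, which is usually faster for thousands of rows. Note that a `label` or `score` column already present in the CSV would be overwritten.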
Upvotes: 24