Reputation: 5767
I want to evaluate some French embedding models on the MTEB Semantic Textual Similarity (STS) tasks. To do this, I took inspiration from this code: run_mteb_french.py
import logging
from sentence_transformers import SentenceTransformer
from mteb import MTEB
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("main")
model_name = "dangvantuan/sentence-camembert-large"
model = SentenceTransformer(model_name)
TASK_LIST_STS = [
    "SummEvalFr",
    "STSBenchmarkMultilingualSTS",
    "STS22",
    "SICKFr",
]
for task in TASK_LIST_STS:
    logger.info(f"Running task: {task}")
    evaluation = MTEB(tasks=[task], task_langs=["fr"])
    evaluation.run(model_name, output_folder=f"fr_results/{model_name}")
But I got this error:
Summarization
- SummEvalFr, p2p
ERROR:mteb.evaluation.MTEB:Error while evaluating SummEvalFr: 'batch_size' is an invalid keyword argument for encode()
TypeError Traceback (most recent call last)
in <cell line: 17>()
     18     logger.info(f"Running task: {task}")
     19     evaluation = MTEB(tasks=[task], task_langs=["fr"])
---> 20     evaluation.run(model_name, output_folder=f"fr_results/{model_name}")

4 frames
/usr/local/lib/python3.10/dist-packages/mteb/evaluation/evaluators/SummarizationEvaluator.py in __call__(self, model)
     51
     52         logger.info(f"Encoding {sum(human_lens)} human summaries...")
---> 53         embs_human_summaries_all = model.encode(
     54             [summary for human_summaries in self.human_summaries for summary in human_summaries],
     55             batch_size=self.batch_size,

TypeError: 'batch_size' is an invalid keyword argument for encode()
What should I do?
Upvotes: 0
Views: 119
Reputation: 5767
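It was a typo!
In my code, I call evaluation.run(model_name, output_folder=f"fr_results/{model_name}"), which passes the model name (a plain string) instead of the loaded model. The evaluator then calls encode() on that string, so Python ends up in str.encode(), which has no batch_size argument.
The call should be evaluation.run(model, output_folder=f"fr_results/{model_name}").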
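To see why passing the name instead of the model produces this particular error, here is a minimal sketch that reproduces it without MTEB or sentence-transformers installed. It assumes only that the evaluator calls encode() with a list of sentences and a batch_size keyword, as the traceback shows. The helper looks_like_model is a hypothetical name introduced for illustration and is not part of MTEB.

```python
# Sketch: reproduce the TypeError using only the standard library.
model_name = "dangvantuan/sentence-camembert-large"  # a plain str, not a loaded model


def looks_like_model(obj):
    """Hypothetical guard: True if obj is not a str and has a callable encode()."""
    return not isinstance(obj, str) and callable(getattr(obj, "encode", None))


error_message = None
try:
    # Same call shape as in the traceback: a list of sentences plus batch_size.
    # On a str this resolves to str.encode(), which does not accept batch_size.
    model_name.encode(["une phrase"], batch_size=32)
except TypeError as err:
    error_message = str(err)

print(error_message)                 # the exact wording depends on the Python version
print(looks_like_model(model_name))  # False, because model_name is a str
```

A check like this before calling evaluation.run() would catch the mix-up early, since a string also has an encode() method and the failure otherwise only surfaces deep inside the evaluator.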
Upvotes: 1