Reputation: 1
import json

from langchain_openai import ChatOpenAI
from langchain.evaluation import load_evaluator

# Chat model served from a local OpenAI-compatible endpoint
llm = ChatOpenAI(base_url="http://localhost:1234/v1", api_key="")
evaluator = load_evaluator("labeled_pairwise_string", llm=llm)

# Shared test cases plus one prediction file per model
with open('test_cases.json', 'r') as file:
    test_cases = json.load(file)
with open('prediction_ollama.json', 'r') as file:
    predictions = json.load(file)
with open('prediction_gemma.json', 'r') as file:
    predictions_b = json.load(file)
with open('prediction_mistral.json', 'r') as file:
    predictions_c = json.load(file)  # third model; not used below, which is the problem

results = []
for i, test_case in enumerate(test_cases):
    result = evaluator.evaluate_string_pairs(
        input=test_case["input"],
        prediction=predictions[i]["prediction"],
        prediction_b=predictions_b[i]["prediction_b"],
        reference=test_case["reference"],  # the "labeled" evaluator expects a reference answer; assumes my test cases have one
    )
    results.append((f"\nTest Case {i+1}", result))

for test_name, result in results:
    print(test_name, "->", result)
I am currently trying to compare predictions from three different LLMs (the Ollama, Gemma, and Mistral outputs loaded above) to evaluate which one gives the best answer for my use case. Is there a way to compare more than two predictions? As I understand it, evaluator.evaluate_string_pairs can only compare two strings at a time, so I'm not sure how to bring the third set of predictions into the comparison.
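The only idea I have so far is a round-robin: run every pair of models through the pairwise evaluator on the same test case and count the wins. Here is a rough sketch of what I mean. It continues from the script above and assumes each prediction file keeps its own key ("prediction", "prediction_b", and "prediction_c" for the Mistral file, which is a guess on my part), plus a "reference" field in each test case.

from itertools import combinations
from collections import Counter

# Continues from the script above: evaluator, test_cases, and the three
# prediction lists are assumed to already be loaded.
models = {
    "ollama": (predictions, "prediction"),
    "gemma": (predictions_b, "prediction_b"),
    "mistral": (predictions_c, "prediction_c"),  # guessing this key name
}

wins = Counter()
for i, test_case in enumerate(test_cases):
    # Round-robin: evaluate every pair of models on the same test case
    for name_a, name_b in combinations(models, 2):
        preds_a, key_a = models[name_a]
        preds_b, key_b = models[name_b]
        result = evaluator.evaluate_string_pairs(
            input=test_case["input"],
            prediction=preds_a[i][key_a],
            prediction_b=preds_b[i][key_b],
            reference=test_case["reference"],  # assumes a reference answer per test case
        )
        # result["value"] is "A" or "B" (or None on a tie) for this pair
        if result.get("value") == "A":
            wins[name_a] += 1
        elif result.get("value") == "B":
            wins[name_b] += 1

print(wins.most_common())

That works in principle, but it needs three evaluator calls per test case (N*(N-1)/2 in general), so I suspect I'm missing a more direct way to compare all of the models at once. Any advice would be appreciated.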
Upvotes: 0
Views: 20