Reputation: 173
I am trying to use LangChain to extract structured output from unstructured text with LLM tool calling.
I have code that works:
import os
from typing import List
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini-2024-07-18", temperature=0.0)

class A(BaseModel):
    a_1: str
    a_2: str
    r: str

class B(BaseModel):
    a: str
    b_1: str
    b_2: str
    r: str

class C(BaseModel):
    ccc: List[A]
    ppp: List[B]

structured_llm = model.with_structured_output(C)
# prompt holds the input text and is defined elsewhere
response = structured_llm.invoke(prompt)
I want "a" to be the key in ppp, but the code below (using Dict) fails:
import os
from typing import Dict, List
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini-2024-07-18", temperature=0.0)

class A(BaseModel):
    a_1: str
    a_2: str
    r: str

class B(BaseModel):
    b_1: str
    b_2: str
    r: str

class C(BaseModel):
    ccc: List[A]
    ppp: Dict[str, List[B]]

structured_llm = model.with_structured_output(C)
# prompt holds the input text and is defined elsewhere
response = structured_llm.invoke(prompt)
Error:
ValidationError: 1 validation error for C
ppp
  Field required [type=missing, input_value={'ccc': [{'a_1': 'Price',...tant to Battery Life'}]}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.9/v/missing
Any clue how to format it as a Dict?
Upvotes: 2
Views: 234
Reputation: 2018
I got the exact same error message when trying to do the same thing. My first idea was to write the Dict[str, str] as List[Tuple[str, str]]. This produced a similar issue, though.
What ended up working for me was to create another model with two attributes acting as a key-value pair, and to use a list of that model:
from typing import Generic, List, TypeVar
from pydantic import BaseModel

# Parameterized key-value pair model
TKey = TypeVar("TKey")
TValue = TypeVar("TValue")

class KeyValuePair(BaseModel, Generic[TKey, TValue]):
    key: TKey
    value: TValue

# A and B are the models defined in the question
class C(BaseModel):
    ccc: List[A]
    ppp: List[KeyValuePair[str, List[B]]]
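If you still need a plain dict afterwards, the list of pairs can be folded back into one after the call. Here is a minimal sketch, assuming the response has been dumped to plain Python data (for example with pydantic v2's model_dump()). The helper name pairs_to_dict and the sample values are made up for illustration. It concatenates the values of repeated keys instead of overwriting them, since nothing stops the model from emitting the same key twice.

```python
from collections import defaultdict
from typing import Any, Dict, List


def pairs_to_dict(pairs: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Collapse [{"key": k, "value": [...]}, ...] into {k: [...]}.

    Values for a repeated key are concatenated rather than overwritten.
    """
    merged: Dict[str, List[Any]] = defaultdict(list)
    for pair in pairs:
        merged[pair["key"]].extend(pair["value"])
    return dict(merged)


# Hypothetical data shaped like response.model_dump()["ppp"]
sample = [
    {"key": "battery", "value": [{"b_1": "x", "b_2": "y", "r": "z"}]},
    {"key": "price", "value": [{"b_1": "p", "b_2": "q", "r": "s"}]},
    {"key": "battery", "value": [{"b_1": "u", "b_2": "v", "r": "w"}]},
]
ppp_by_key = pairs_to_dict(sample)
```

With the real models, response.model_dump()["ppp"] would take the place of sample.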
Let me know if that works for you.
Upvotes: 1