Reputation: 51
I am currently using LangChain and OpenAI to build a Natural Language to SQL model. The issue I am having is that I want the model to return "I don't know" or "Please provide more context" when answering a question as vague as "Number of rows". This is the current code I have below:
_DEFAULT_TEMPLATE = “”"Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
Use the following format:
Question: "Question here"
SQLQuery: "SQL Query to run"
SQLResult: "Result of the SQLQuery"
Answer: "Final answer here"
Only use the following tables: SDS_DataModel.Person and SDS_DataModel.Campus
SDS_DataModel.Person contains the following information: first name, last name, GreendaleID, email and net worth.
SDS_DataModel.Campus contains the following information: GPA, Graduation, GreendaleID, their on campus job and if they live on campus.
SDS_DataModel.Campus and SDS_DataModel.Person are joined by GreendaleID.
{table_info}
Never use LIMIT statement, use TOP statement instead.
Format all numeric response ###,###,###,###.
Ask as many clarifying questions as need be before answering.
Only return relevant columns to the question
Question: {input}"""
PROMPT = PromptTemplate(
input_variables=["input", "table_info", "dialect"], template=_DEFAULT_TEMPLATE
)
connection_string = "url"
db = SQLDatabase.from_uri(connection_string)
llm = OpenAI(temperature=0, verbose=True, model="gpt-3.5-turbo-instruct")
database_chain = create_sql_query_chain(llm,db, prompt=PROMPT)
sql_query = database_chain.invoke({"question": x})
I am unsure how to prompt engineer it or get threshold values so that the model doesn't answer the question if it doesnt have enough information or if it isnt confident enough.
Upvotes: 1
Views: 582
Reputation: 4253
I feed the json schema of tables then instruct the llm to create a left join sql and output certain fields. The llm finds the foreign key and then creates the on links
chain_llm=OpenAI(model="gpt-3.5-turbo",temperature=0.7, openai_api_key=key)
declare @Output as nvarchar(max)
select @Output=
(
SELECT
TABLE_NAME AS 'TableName',
COLUMN_NAME AS 'FieldName',
DATA_TYPE AS 'FieldType'
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_SCHEMA = 'dbo'
and TABLE_CATALOG='ReportData'
and TABLE_NAME in ('fPMDrawList','dContract')
FOR JSON AUTO
)
select @Output
json_data=[{"TableName":"dContract","FieldName":"JCCo","FieldType":"tinyint"},{"TableName":"dContract","FieldName":"ContractKeyID","FieldType":"bigint"},{"TableName":"dContract","FieldName":"CoDep","FieldType":"varchar"},{"TableName":"dContract","FieldName":"Contract","FieldType":"varchar"},{"TableName":"dContract","FieldName":"ContractDescription","FieldType":"varchar"},{"TableName":"dContract","FieldName":"PayTerms","FieldType":"varchar"},{"TableName":"dContract","FieldName":"PayTermsDescription","FieldType":"varchar"},{"TableName":"dContract","FieldName":"DepartmentKeyID","FieldType":"bigint"},{"TableName":"dContract","FieldName":"Department","FieldType":"varchar"},{"TableName":"dContract","FieldName":"DepartmentDescription","FieldType":"varchar"},{"TableName":"dContract","FieldName":"StartMonth","FieldType":"smalldatetime"},{"TableName":"dContract","FieldName":"MonthClosed","FieldType":"smalldatetime"},{"TableName":"dContract","FieldName":"ContractAmt","FieldType":"numeric"},{"TableName":"dContract","FieldName":"BilledAmt","FieldType":"numeric"},{"TableName":"dContract","FieldName":"ReceivedAmt","FieldType":"numeric"},{"TableName":"dContract","FieldName":"CurrentRetainAmt","FieldType":"numeric"},{"TableName":"dContract","FieldName":"ContractStatus","FieldType":"tinyint"},{"TableName":"dContract","FieldName":"udContractGroup","FieldType":"varchar"},{"TableName":"dContract","FieldName":"ContractStatusDescription","FieldType":"varchar"},{"TableName":"dContract","FieldName":"PercentBilled","FieldType":"numeric"},{"TableName":"dContract","FieldName":"ActiveJobs","FieldType":"int"},{"TableName":"dContract","FieldName":"ContractKey","FieldType":"bigint"},{"TableName":"fPMDrawList","FieldName":"ContractKeyID","FieldType":"bigint"},{"TableName":"fPMDrawList","FieldName":"JobMasterKeyID","FieldType":"bigint"},{"TableName":"fPMDrawList","FieldName":"JobDepartmentKeyID","FieldType":"bigint"},{"TableName":"fPMDrawList","FieldName":"TotalCalculatedRetention","FieldType":"numeric"},{"TableName":"fPMDrawList","FieldName":"HQCo","FieldType":"tinyint"},{"TableName":"fPMDrawList","FieldName":"Name","FieldType":"varchar"},{"TableName":"fPMDrawList","FieldName":"Department","FieldType":"varchar"},{"TableName":"fPMDrawList","FieldName":"DepartmentDescription","FieldType":"varchar"},{"TableName":"fPMDrawList","FieldName":"Customer","FieldType":"int"},{"TableName":"fPMDrawList","FieldName":"CustomerName","FieldType":"varchar"},{"TableName":"fPMDrawList","FieldName":"ContractStatus","FieldType":"tinyint"},{"TableName":"fPMDrawList","FieldName":"udProjectStatus","FieldType":"varchar"},{"TableName":"fPMDrawList","FieldName":"ProjectManager","FieldType":"varchar"},{"TableName":"fPMDrawList","FieldName":"Contract","FieldType":"varchar"},{"TableName":"fPMDrawList","FieldName":"JCCo","FieldType":"tinyint"},{"TableName":"fPMDrawList","FieldName":"Job","FieldType":"varchar"},{"TableName":"fPMDrawList","FieldName":"ContractDescription","FieldType":"varchar"},{"TableName":"fPMDrawList","FieldName":"ContractAmt","FieldType":"numeric"},{"TableName":"fPMDrawList","FieldName":"UnbilledRevenue","FieldType":"numeric"},{"TableName":"fPMDrawList","FieldName":"UnbilledRetainage","FieldType":"numeric"},{"TableName":"fPMDrawList","FieldName":"BilledToDate","FieldType":"numeric"},{"TableName":"fPMDrawList","FieldName":"NetDue","FieldType":"numeric"},{"TableName":"fPMDrawList","FieldName":"ReceivedAmt","FieldType":"numeric"},{"TableName":"fPMDrawList","FieldName":"AmountDue","FieldType":"numeric"},{"TableName":"fPMDrawList","FieldName":"Application","FieldType":"smallint"},{"TableName":"fPMDrawList","FieldName":"InvDate","FieldType":"date"},{"TableName":"fPMDrawList","FieldName":"InvDue","FieldType":"numeric"},{"TableName":"fPMDrawList","FieldName":"Invoice","FieldType":"varchar"},{"TableName":"fPMDrawList","FieldName":"BillMonth","FieldType":"smalldatetime"},{"TableName":"fPMDrawList","FieldName":"CheckDate","FieldType":"date"},{"TableName":"fPMDrawList","FieldName":"Amount_Paid","FieldType":"numeric"},{"TableName":"fPMDrawList","FieldName":"Retainage","FieldType":"numeric"},{"TableName":"fPMDrawList","FieldName":"JobMasterKey","FieldType":"bigint"},{"TableName":"fPMDrawList","FieldName":"ContractKey","FieldType":"bigint"},{"TableName":"fPMDrawList","FieldName":"JobDepartmentKey","FieldType":"bigint"}]
instructions="Generate sql select for the following fields: ContractDescription, DepartmentDescription, Name, Contract with a left outer join between fPMDrawList and dPMContracts"
prompt_template = PromptTemplate(
template="{instructions} based on the JSON data.\n{json_data}\n",
input_variables=["json_data"]
)
pipline_chain=prompt_template | chain_llm | OutputParser
result = pipline_chain.invoke({"instructions":instructions,"json_data": json_data})
print(result)
Upvotes: 0