Reputation: 182
I want to understand the underlying implementation. I know it uses NLP, but how does it determine whether the requested thing is a table or a column? Maybe they are using spaCy, customised a bit to understand database terms.
What does it store in memory? Obviously they are not storing the whole database. From this answer, I learned that they store the DDL of the database.
But a huge database will probably have a large DDL. Won't that create an issue?
Upvotes: 0
Views: 1074
Reputation: 49571
But how does it determine whether the requested thing is a table or a column?
If you give ChatGPT the query "How many users are there in my database", it will assume that there is a Users table and query it. If that table exists, ChatGPT will give you the correct answer. But if you ask "How many different zip codes are there in shipping addresses", your table may be named Address, yet ChatGPT will assume that there is a Shipping Address table, which does not exist. You will then get an error, something like "no column exist". You have to write code to handle this error. For example, you can send the error message back to ChatGPT so that it tries again with a different approach.
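A minimal sketch of that retry idea, using only sqlite3 from the standard library. The names run_with_retry, generate_sql and fake_generate_sql are made up for illustration; generate_sql stands in for whatever function calls the model, and this is not how LangChain itself handles errors.

```python
import sqlite3
from typing import Callable, Optional


def run_with_retry(
    conn: sqlite3.Connection,
    question: str,
    generate_sql: Callable[[str, Optional[str]], str],
    max_attempts: int = 3,
) -> list:
    """Ask the model for SQL, run it, and feed any database error back."""
    last_error: Optional[str] = None
    for _ in range(max_attempts):
        # generate_sql is a hypothetical wrapper around the LLM call.
        sql = generate_sql(question, last_error)
        try:
            return conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            # Keep the failed query and the error text for the next prompt.
            last_error = f"Query {sql!r} failed: {exc}"
    raise RuntimeError(f"No working query after {max_attempts} attempts: {last_error}")


def fake_generate_sql(question: str, error: Optional[str]) -> str:
    # Stand-in for the model: guesses a wrong table first, then corrects itself.
    if error is None:
        return "SELECT COUNT(DISTINCT zip) FROM ShippingAddress"
    return "SELECT COUNT(DISTINCT zip) FROM Address"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Address (id INTEGER PRIMARY KEY, zip TEXT)")
conn.executemany(
    "INSERT INTO Address (zip) VALUES (?)",
    [("10001",), ("10001",), ("94105",)],
)
rows = run_with_retry(
    conn,
    "How many different zip codes are there in shipping addresses?",
    fake_generate_sql,
)
print(rows)  # [(2,)]
```

The first generated query fails because ShippingAddress does not exist; the second attempt sees the error text and succeeds. In real code you would cap the number of attempts, as here, so a confused model cannot loop forever.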
You have to be more informative. ChatGPT has to know which tables exist, so you should send it the list of available tables.
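One way to do that, sketched with sqlite3 only. The helper names describe_schema and build_prompt and the prompt wording are assumptions for illustration, not taken from any library.

```python
import sqlite3


def describe_schema(conn: sqlite3.Connection) -> str:
    """Return one line per table, for example 'Address(id, user_id, zip)'."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    lines = []
    for (table,) in tables:
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        lines.append(f"{table}({', '.join(columns)})")
    return "\n".join(lines)


def build_prompt(conn: sqlite3.Connection, question: str) -> str:
    """Put the available tables in front of the question (hypothetical wording)."""
    return (
        "Only use the following tables and columns:\n"
        f"{describe_schema(conn)}\n\n"
        f"Question: {question}\nSQL:"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE User (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE Address (id INTEGER PRIMARY KEY, user_id INTEGER, zip TEXT)")

schema_text = describe_schema(conn)
prompt = build_prompt(
    conn, "How many different zip codes are there in shipping addresses?"
)
print(prompt)
```

With the table list in the prompt, the model can see that Address exists and Shipping Address does not, so it has less reason to guess.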
Upvotes: 0
Reputation: 2856
This is the implementation of SQLDatabaseChain:
https://github.com/langchain-ai/langchain/blob/master/libs/experimental/langchain_experimental/sql/base.py
Regarding your questions:
What does it store in memory? Obviously they are not storing whole database.
Answer: Correct, SQLDatabaseChain does not store the entire database; it works from metadata.
From this answer, I got to know they are storing DDL of Database. But huge database will mostly have large ddl. Won't that create issue?
Answer: The metadata mostly consists of table names, column names, and primary and foreign keys. Taken together, this information is very small compared to the DDL.
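To give a feel for what such metadata looks like, here is a rough sketch that collects it for one table with sqlite3. It is only an illustration under my own assumptions (the function name table_metadata and the dictionary layout are made up); it is not the code from the linked file.

```python
import sqlite3


def table_metadata(conn: sqlite3.Connection, table: str) -> dict:
    """Collect column names, primary key and foreign keys for one table."""
    # table_info rows: (cid, name, type, notnull, dflt_value, pk)
    columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
    fks = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
    return {
        "table": table,
        "columns": [c[1] for c in columns],
        "primary_key": [c[1] for c in columns if c[5] > 0],
        # (local column, referenced table, referenced column)
        "foreign_keys": [(fk[3], fk[2], fk[4]) for fk in fks],
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE User (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE Address ("
    "id INTEGER PRIMARY KEY, "
    "user_id INTEGER REFERENCES User(id), "
    "zip TEXT)"
)

meta = table_metadata(conn, "Address")
print(meta)
```

A summary like this grows with the number of tables and columns, not with the number of rows, which is why it stays manageable even when the data itself is large.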
Upvotes: 1