Reputation: 5365
I have a FastAPI app where I use pgvector-python to store embeddings from an LLM.
Storing data works fine, but I'm struggling to figure out how to retrieve data based on cosine similarity in an efficient way.
Say I have a request with three new embeddings, data_new = [[1,2,3],[3,2,1],[1,2,1]], and I want to get the row from my database that has the highest similarity with each embedding in data_new.
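For reference on what I mean by similarity: Postgres's <=> operator (which I use below) returns cosine distance, i.e. 1 minus cosine similarity. In plain Python that is:

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity, matching pgvector's <=> operator
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)
```

So the row with the highest similarity is the one with the smallest <=> distance.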
I could loop over data_new and run some raw SQL, e.g.:
from sqlalchemy import create_engine, text

data_new = [[1, 2, 3], [3, 2, 1], [1, 2, 1]]
TOP_DOCS = []
engine = create_engine()  # returns a SQLAlchemy Engine

with engine.connect() as conn:
    for data in data_new:
        # pgvector expects a vector literal like '[1,2,3]', so bind it as a string
        query = text("SELECT * FROM embeddings ORDER BY embedding <=> :vec LIMIT 1")
        res = conn.execute(query, {"vec": str(data)})
        TOP_DOCS.append(res.fetchall())
but I doubt that is the most efficient way. Isn't there a way to optimize this using pgvector-python, or do we really need to fetch the data like this?
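One idea I had for avoiding the round trip per vector: unnest all query vectors in a single statement and join each one to its nearest neighbour with a LATERAL subquery. This is an untested sketch that just builds the SQL string (placeholders :v0, :v1, ... would be bound to '[1,2,3]'-style strings):

```python
def build_batch_knn_sql(num_vectors):
    # One statement for all query vectors: turn them into rows with VALUES,
    # then join each row to its nearest neighbour via a LATERAL subquery.
    values = ", ".join(f"(:v{i})" for i in range(num_vectors))
    return (
        f"SELECT q.vec, e.* "
        f"FROM (VALUES {values}) AS q(vec) "
        f"CROSS JOIN LATERAL ("
        f"SELECT * FROM embeddings "
        f"ORDER BY embedding <=> q.vec::vector LIMIT 1"
        f") e"
    )
```

But I don't know whether that is what pgvector-python intends, or whether there is a built-in way.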
If it matters, the SQLAlchemy model in my FastAPI app looks like:
from base import Base
from sqlalchemy import Integer
from sqlalchemy.orm import mapped_column
from pgvector.sqlalchemy import Vector


class OcrModel(Base):
    __tablename__ = "embedding"

    user_id = mapped_column(Integer, primary_key=True, index=True)
    embedding = mapped_column(Vector(1536))
Upvotes: 0
Views: 1684