Supabase pgvector with custom data for query

Question

I am trying to make a database with document embeddings using pgvector, but I would like to know if there is a way to have some field in the collection of documents so that I can categorize the documents, example: if I have multiple companies and each one has different information , and I don't want to mix their data, can't I query the documents for some field and thus only obtain those related to that company? Like space-name in pinecone.

create table documents (
  id bigserial primary key,
  content text,
  embedding vector(1536),
company_id text // Custom id for each company
);


  const { data: documents, error } = await supabaseClient.rpc('match_documents', {
    query_embedding: embedding,
    match_threshold: .73,
    match_count: 10,
    company_id: "" // pass id to func
  })

Inside the match_documents function filter to only get embeddings from that company.

How can I do this?

    create or replace function match_documents (
  query_embedding vector(384),
  match_threshold float,
  match_count int
)
returns table (
  id bigint,
  content text,
  similarity float
)
language sql stable
as $$
  select
    documents.id,
    documents.content,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where 1 - (documents.embedding <=> query_embedding) > match_threshold and documents.custom_id = custom_id
  order by similarity desc
  limit match_count;
$$;

Supabase pgvector with custom data for query

Answers (1)

Related Questions