nsb

Reputation: 19

Hardware requirements for using sentence-transformers/all-MiniLM-L6-v2

Can someone please advise me on the hardware requirements for using sentence-transformers/all-MiniLM-L6-v2 in a semantic-similarity use case? I downloaded the model locally and am using it to generate embeddings, then using util.pytorch_cos_sim to calculate similarity scores between two sentences. Everything worked fine on my Mac Pro (2.4 GHz 8-core Intel Core i9, 32 GB memory), but after I moved the model to containers with 1 CPU core and 4 GB RAM (on my company's network), the code takes at least 15-20 times longer to produce the cosine similarity score.

Has anyone faced a similar situation? Any advice would be appreciated. Thank you in advance for the help!

N.B.: I am also sharing the sample code for reference.

from sentence_transformers import SentenceTransformer, util

sentences = ["What happens when my account is debited", "What is a debit"]

# Model instantiation (from a local copy of all-MiniLM-L6-v2)
sent_sim_model = SentenceTransformer('./all-MiniLM-L6-v2')

# Encode each sentence into an embedding tensor
embedding_0 = sent_sim_model.encode(sentences[0], convert_to_tensor=True)
embedding_1 = sent_sim_model.encode(sentences[1], convert_to_tensor=True)

# Calculate the cosine similarity score
print(util.pytorch_cos_sim(embedding_0, embedding_1).tolist()[0][0])

I have been running the model successfully on my local system for quite some time now (after storing it locally in the same directory as the code), but once I moved the model and the above code to a Docker container, the response time went from 2-3 seconds on my local system to more than a minute. Since each container has 1 CPU core and 4 GB RAM, I would like input on whether this limited hardware could be the cause.
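For what it's worth, below is a small timing harness to check whether the model load, the encode step, or the cosine computation dominates. The torch.set_num_threads(1) call is only a guess at a mitigation (PyTorch may spawn more intra-op threads than a 1-core container can actually use); I have not confirmed it helps:

import time

import torch
from sentence_transformers import SentenceTransformer, util

# Guessed mitigation: pin PyTorch's intra-op thread pool to the single
# core the container exposes, to avoid thread oversubscription.
torch.set_num_threads(1)

start = time.perf_counter()
sent_sim_model = SentenceTransformer('./all-MiniLM-L6-v2')
print(f"model load: {time.perf_counter() - start:.2f}s")

sentences = ["What happens when my account is debited", "What is a debit"]

start = time.perf_counter()
embeddings = sent_sim_model.encode(sentences, convert_to_tensor=True)
print(f"encode: {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
score = util.pytorch_cos_sim(embeddings[0], embeddings[1]).tolist()[0][0]
print(f"cosine: {time.perf_counter() - start:.4f}s, score={score:.4f}")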

Upvotes: 1

Views: 5142

Answers (1)

Navicstein rotciv

Reputation: 46

I can't add a comment, so I'm giving a full reply.

I built a tiny Docker REST API with Flask and deployed it to https://fly.io/ with under 2 GB of RAM, and I get pretty good results:

from flask import Flask, jsonify, request
from flask_cors import CORS
from sentence_transformers import SentenceTransformer, util
from dotenv import load_dotenv

app = Flask(__name__)
CORS(app)
load_dotenv()

# Load the model once at startup, not per request
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

@app.route('/get_embeddings', methods=['POST'])
def get_embeddings():
    # Expects a JSON body like {"text": "some sentence"}
    text = request.json.get('text')
    embeddings = model.encode(text)
    return jsonify(embeddings=embeddings.tolist())


@app.route('/get_score', methods=['GET'])
def get_score():
    # Expected JSON body once the hard-coded sentences are removed:
    #   {
    #     "text": [
    #         "What happens when my account is debited",
    #         "What is a debit"
    #     ]
    #   }
    # sentences = request.json.get('text')  # [str, str]
    sentences = ["What happens when my account is debited", "What is a debit"]
    embedding_0 = model.encode(sentences[0], convert_to_tensor=True)
    embedding_1 = model.encode(sentences[1], convert_to_tensor=True)
    score = util.pytorch_cos_sim(embedding_0, embedding_1).tolist()[0][0]
    return jsonify(score=score)

if __name__ == "__main__":
    app.run(host="0.0.0.0", debug=False)

Built with Nixpacks:

nixpacks build ./ --name embedder

Run locally:

docker run -m 1gb --cpus 1 -p 5000:5000 embedder

To scale the app's memory on Fly:

flyctl scale memory 2048 -a embedder

Deploy to Fly or Railway and test using Postman; it takes a few seconds to return the results.
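For example, a quick check from Python instead of Postman (assuming the container is running locally on port 5000, as in the docker run command above):

import requests

# Score the two hard-coded sentences via the GET endpoint
resp = requests.get("http://localhost:5000/get_score")
print(resp.json())  # e.g. {"score": <cosine similarity>}

# Fetch raw embeddings for a single sentence via the POST endpoint
resp = requests.post(
    "http://localhost:5000/get_embeddings",
    json={"text": "What is a debit"},
)
print(len(resp.json()["embeddings"]))  # all-MiniLM-L6-v2 outputs 384 dimensions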

Upvotes: 1
