Reputation: 31
I would like to use sentence-transformers (https://www.sbert.net/) to encode some English sentences. To speed this up, I am trying to run it on 2 T4 GPUs from a Jupyter notebook on GCP (Debian Linux, Python 3.8). (I originally posted this question at https://github.com/UKPLab/sentence-transformers/issues/2235 but got no response.)
from sentence_transformers import SentenceTransformer, LoggingHandler
import logging
logging.basicConfig(format='%(asctime)s - %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S',
                    level=logging.INFO,
                    handlers=[LoggingHandler()])
sentences = ["This is sentence {}".format(i) for i in range(10)]
# Define the model
model = SentenceTransformer('all-MiniLM-L6-v2', device='cuda')
# Start the multi-process pool on all available CUDA devices
pool = model.start_multi_process_pool(target_devices=['cuda:0', 'cuda:1'])
# Compute the embeddings using the multi-process pool
emb = model.encode_multi_process(sentences, pool)  # error - Jupyter kernel restarting
print("Embeddings computed. Shape:", emb.shape, "type: ", type(emb))
print("Embeddings computed:", emb)
Output:
- Load pretrained SentenceTransformer: all-MiniLM-L6-v2
- Start multi-process pool on devices: cuda:0, cuda:1
Then I got this error:
Kernel Restarting: The kernel for my_notebook.ipynb appears to have died. It will restart automatically.
Could anybody let me know if I missed anything?
============== UPDATE ===========
TypeError Traceback (most recent call last)
Cell In[13], line 15
12 model = SentenceTransformer('all-MiniLM-L6-v2')
14 # Move the model to the first device
---> 15 model = model.to(devices[0])
17 # Wrap the model with DataParallel to utilize multiple GPUs
18 model = torch.nn.DataParallel(model, device_ids=[device.index for device in devices])
File /usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py:1126, in Module.to(self, *args, **kwargs)
1039 def to(self, *args, **kwargs):
1040 r"""Moves and/or casts the parameters and buffers.
1041
1042 This can be called as
(...)
1123
1124 """
-> 1126 device, dtype, non_blocking, convert_to_format = torch._C._nn._parse_to(*args, **kwargs)
1128 if dtype is not None:
1129 if not (dtype.is_floating_point or dtype.is_complex):
TypeError: to() received an invalid combination of arguments - got (device), but expected one of:
* (torch.device device, torch.dtype dtype, bool non_blocking, bool copy, *, torch.memory_format memory_format)
* (torch.dtype dtype, bool non_blocking, bool copy, *, torch.memory_format memory_format)
* (Tensor tensor, bool non_blocking, bool copy, *, torch.memory_format memory_format)
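A note on this traceback: `Module.to()` raises this `TypeError` when it is handed something that is not a device, a dtype, or a tensor. One way to end up there is to build the device with `torch.cuda.device(...)`, which is a context manager for switching the current CUDA device, instead of `torch.device(...)`. The sketch below shows the difference; the `Linear` layer is only a small stand-in for the real model.

```python
import torch

# torch.device describes a device; Module.to() and Tensor.to() accept it.
dev = torch.device("cuda:0")

# torch.cuda.device is a context manager that switches the current CUDA
# device inside a `with` block; it is not a torch.device.
ctx = torch.cuda.device(0)

print(isinstance(dev, torch.device))  # True
print(isinstance(ctx, torch.device))  # False

# Stand-in module: passing the context manager to .to() is rejected.
layer = torch.nn.Linear(2, 2)
try:
    layer.to(ctx)
    raised = False
except TypeError:
    raised = True
print("to(ctx) raised TypeError:", raised)
```

With real devices, `model.to(torch.device("cuda:0"))` or simply `model.to("cuda:0")` would be the usual form.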
Upvotes: 0
Views: 1347
Reputation: 1
The error you encountered might be due to memory constraints when running the code on multiple GPUs with large models. To address this, you can try reducing the batch size or using a smaller model. You can also try a smaller number of sentences for testing purposes.
import torch
from sentence_transformers import SentenceTransformer
# Check the number of available GPUs
num_gpus = torch.cuda.device_count()
# Specify the devices to be used (cuda:0, cuda:1, ...) as torch.device objects
devices = [torch.device(f'cuda:{i}') for i in range(num_gpus)]
# Initialize the model and put it in inference mode
model = SentenceTransformer('all-MiniLM-L6-v2')
model.eval()
# Move the model to the first device
model = model.to(devices[0])
# Wrap the model with DataParallel to utilize multiple GPUs
model = torch.nn.DataParallel(model, device_ids=[device.index for device in devices])
# Encode sentences using the model
sentences = ["This is sentence {}".format(i) for i in range(10)]
with torch.no_grad():
    embeddings = []
    # Define the batch size
    batch_size = 2
    # Iterate over the sentences in batches
    for i in range(0, len(sentences), batch_size):
        # Tokenize the batch (the wrapped model is reachable as model.module)
        features = model.module.tokenize(sentences[i:i+batch_size])
        # Move the tokenized tensors to the first device
        features = {key: value.to(devices[0]) for key, value in features.items()}
        # Encode the batch; this relies on forward() taking the feature dict
        # and returning a dict with a 'sentence_embedding' entry
        output = model(features)
        # Move the embeddings back to CPU
        batch_embeddings = output['sentence_embedding'].cpu()
        # Collect the embeddings
        embeddings.append(batch_embeddings)
# Concatenate the embeddings from all batches
embeddings = torch.cat(embeddings, dim=0)
print("Embeddings computed. Shape:", embeddings.shape, "type:", type(embeddings))
print("Embeddings computed:", embeddings)
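The batch-size suggestion above can also be applied to the multi-process pool from the question. Below is a minimal sketch of that, written as a script rather than a notebook cell. `target_devices` and `encode_on_gpus` are hypothetical helper names, and `batch_size=2` is just a small value for testing. The library's own multi-GPU example shields its code with an `if __name__ == "__main__":` guard because CUDA can otherwise run into issues when new processes are spawned. Whether that is what kills the notebook kernel here is an assumption, not something verified.

```python
from typing import List


def target_devices(num_gpus: int) -> List[str]:
    """Build device names such as ['cuda:0', 'cuda:1'] (hypothetical helper)."""
    return [f"cuda:{i}" for i in range(num_gpus)]


def encode_on_gpus(sentences: List[str], batch_size: int = 2):
    """Encode sentences with one worker process per visible GPU (sketch)."""
    # Imported lazily so the helper above works without the GPU stack.
    import torch
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    pool = model.start_multi_process_pool(
        target_devices=target_devices(torch.cuda.device_count())
    )
    try:
        # A smaller batch_size lowers peak memory use per worker.
        return model.encode_multi_process(sentences, pool, batch_size=batch_size)
    finally:
        # Always shut the worker processes down, even on failure.
        model.stop_multi_process_pool(pool)


# When saved as a script, call it under a main guard:
#
#     if __name__ == "__main__":
#         emb = encode_on_gpus([f"This is sentence {i}" for i in range(10)])
#         print("Embeddings computed. Shape:", emb.shape)
```

Running this with `python script.py` from a terminal, instead of inside Jupyter, would at least separate notebook-specific multiprocessing problems from memory problems.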
Upvotes: 0