Nazareno De Francesco

Reputation: 79

Pytorch NLP Huggingface: model not loaded on GPU

I have this code that initializes a class with a model and a tokenizer from Huggingface. On Google Colab this code works fine: it loads the model into GPU memory without problems. On Google Cloud Platform it does not work; the model is never loaded onto the GPU, whatever I try.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


class OPT:
    def __init__(self, model_name: str = "facebook/opt-2.7b", use_gpu: bool = False):
        self.model_name = model_name
        self.use_gpu = use_gpu and torch.cuda.is_available()
        print(f"Use gpu:: {self.use_gpu}")

        if self.use_gpu:
            print("Using gpu")
            self.model = AutoModelForCausalLM.from_pretrained(
                self.model_name, torch_dtype=torch.float16
            ).cuda()
        else:
            print("Using cpu")
            self.model = AutoModelForCausalLM.from_pretrained(
                self.model_name, torch_dtype=torch.float32, low_cpu_mem_usage=True
            )

        # the fast tokenizer currently does not work correctly
        self.tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)

The printed output is correct:

Use gpu:: True
Using gpu

But nvidia-smi says that there is no process running on the GPU:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.82.01    Driver Version: 470.82.01    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:04.0 Off |                    0 |
| N/A   40C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

And with htop I can see that the process is using CPU RAM.
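
For what it's worth, the device the weights actually ended up on can also be checked from inside Python; a minimal sketch using the OPT class above:

opt = OPT(use_gpu=True)

# Every parameter reports the device it lives on; with a working GPU setup this prints cuda:0.
print(next(opt.model.parameters()).device)

# Bytes of GPU memory currently allocated by PyTorch tensors (0 means nothing was moved over).
print(torch.cuda.memory_allocated())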

Upvotes: 5

Views: 8494

Answers (2)

Unicorn

Reputation: 57

You should use the .to(device) method like this:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nameofyourmodel.to(device)
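
Applied to the class in the question, a minimal sketch (same checkpoint and imports as above) would be:

import torch
from transformers import AutoModelForCausalLM

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the weights in half precision and move them to the selected device.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-2.7b", torch_dtype=torch.float16
).to(device)

print(next(model.parameters()).device)  # cuda:0 when a GPU is available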

Upvotes: 5

Aditya Punetha

Reputation: 46

Like with every PyTorch model, you need to put it on the GPU, as well as your batches of inputs, using the .to(device) method.
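
A minimal sketch of both steps, assuming the facebook/opt-2.7b checkpoint and tokenizer from the question:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-2.7b", use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-2.7b", torch_dtype=torch.float16
).to(device)

# The tokenized inputs must live on the same device as the model before calling generate().
inputs = tokenizer("Hello, my name is", return_tensors="pt").to(device)
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))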

https://discuss.huggingface.co/t/is-transformers-using-gpu-by-default/8500
https://github.com/huggingface/transformers/issues/2704

Upvotes: 0
