Jakob Hürlemann

Reputation: 15

CUDA out of memory when running a pretrained model

I am new to the world of PyTorch. I have searched and tried a couple of other sources to get rid of the CUDA memory error, without luck; perhaps someone here has a solution.

I have the following code and simply want to run it:

from PIL import Image
import requests

import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-13b-hf"

prompt = "USER: <image>\nWhat are these?\nASSISTANT:"
image_file = "http://images.cocodataset.org/val2017/000000039769.jpg"

# Load the model weights in half precision (fp16) and move them to GPU 0
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).to(0)

processor = AutoProcessor.from_pretrained(model_id)

# Download the sample image and build the model inputs on GPU 0 in fp16
raw_image = Image.open(requests.get(image_file, stream=True).raw)
inputs = processor(prompt, raw_image, return_tensors='pt').to(0, torch.float16)

output = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(processor.decode(output[0][2:], skip_special_tokens=True))



When I start the program, I immediately get the standard CUDA out-of-memory error.

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.146.02             Driver Version: 535.146.02   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4070 Ti     Off | 00000000:01:00.0 Off |                  N/A |
| 30%   57C    P0              34W / 285W |      0MiB / 12282MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

Is it possible that the graphics card is really too weak? I can hardly imagine that, since the script takes only about 20 seconds to run on the CPU. I have tried everything with batch sizes, clearing the cache, and rebooting. Does anybody know how to get the pretrained model running, or can you point me in the right direction?

Upvotes: 0

Views: 630

Answers (1)

andreanliu

Reputation: 1

Did you already try using a local image file instead of loading one from a URL?

You could also try using a smaller sample image or generating fewer tokens.
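
Applied to the script from the question, those suggestions might look roughly like this (just a sketch: the local file name sample.jpg, the thumbnail size, and the lower max_new_tokens value are placeholder choices, and whether this is enough depends on how much VRAM the model itself needs):

from PIL import Image

import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-13b-hf"
prompt = "USER: <image>\nWhat are these?\nASSISTANT:"

model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).to(0)

processor = AutoProcessor.from_pretrained(model_id)

# Load the image from disk instead of downloading it (placeholder file name)
raw_image = Image.open("sample.jpg")
# Optionally shrink the image before handing it to the processor
raw_image.thumbnail((336, 336))

inputs = processor(text=prompt, images=raw_image, return_tensors="pt").to(0, torch.float16)

# Generate fewer new tokens than the original 200
output = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(processor.decode(output[0][2:], skip_special_tokens=True))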

Upvotes: 0
