Reputation: 15
I am new to PyTorch. I have searched and tried a couple of other sources to get rid of the CUDA out-of-memory error, without luck. Perhaps someone here has a solution.
I have the following code and simply want to run it:
import requests  # needed for requests.get below
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-13b-hf"
prompt = "USER: <image>\nWhat are these?\nASSISTANT:"
image_file = "http://images.cocodataset.org/val2017/000000039769.jpg"

# Load the model in half precision and move it to GPU 0
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).to(0)
processor = AutoProcessor.from_pretrained(model_id)

# Download the image and build the model inputs
raw_image = Image.open(requests.get(image_file, stream=True).raw)
inputs = processor(prompt, raw_image, return_tensors='pt').to(0, torch.float16)

output = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(processor.decode(output[0][2:], skip_special_tokens=True))
When I start the program, I immediately get the standard CUDA out of memory error. Here is the nvidia-smi output:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.146.02 Driver Version: 535.146.02 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 4070 Ti Off | 00000000:01:00.0 Off | N/A |
| 30% 57C P0 34W / 285W | 0MiB / 12282MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
Is it possible that the graphics card is really too weak? I find that hard to believe, since the script takes only about 20 seconds to run on the CPU. I have tried changing batch sizes, clearing the cache, and rebooting. Does anybody know a fix, or can anyone point me in the right direction to get the pretrained model running?
Upvotes: 0
Views: 630
Reputation: 1
Have you already tried loading the image from a local file instead of from a URL?
You could also try a smaller image, or generating fewer tokens (for example, a lower max_new_tokens).
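It may also be worth checking whether the weights can fit on the card at all. The following is only a back-of-the-envelope sketch. It assumes roughly 13 billion parameters (read off the "13b" in the model id, not measured), and it counts weights only, ignoring activations, the generation cache, and the CUDA context. The helper name weight_memory_gib is made up for this illustration.

```python
# Rough estimate of GPU memory needed for model weights alone.
# Assumptions (not measured): ~13e9 parameters, inferred from "13b" in the model id;
# activations, the generation cache and the CUDA context are ignored.

GIB = 1024 ** 3


def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Return the size of the weights in GiB for a given storage precision."""
    return num_params * bytes_per_param / GIB


NUM_PARAMS = 13e9                # assumption, see above
GPU_MEMORY_GIB = 12282 / 1024    # total memory reported by nvidia-smi (12282 MiB)

for label, bytes_per_param in [("float32", 4), ("float16", 2), ("8-bit", 1), ("4-bit", 0.5)]:
    needed = weight_memory_gib(NUM_PARAMS, bytes_per_param)
    verdict = "fits" if needed < GPU_MEMORY_GIB else "does not fit"
    print(f"{label:>8}: {needed:5.1f} GiB of weights, {verdict} in {GPU_MEMORY_GIB:.1f} GiB")
```

Under these assumptions the float16 weights alone come to roughly 24 GiB, about twice what the card offers, so the error may simply mean the model does not fit in GPU memory. If so, a smaller image or fewer tokens would probably not be enough on their own, and a smaller or quantized checkpoint might be the thing to look at.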
Upvotes: 0