Henry

Reputation: 77

Release memory for Pycuda

How do I release memory after a Pycuda function call?

For example, in the code below, how do I release the memory used by a_gpu so that there is enough memory left to allocate b_gpu, instead of getting the error shown below?

I tried from pycuda.tools import PooledDeviceAllocation and also import pycuda.tools.PooledDeviceAllocation, hoping to use the free() function, but both imports fail: the first with ImportError: cannot import name 'PooledDeviceAllocation' from 'pycuda.tools' (D:\ProgramData\Anaconda3\lib\site-packages\pycuda\tools.py), and the second with ModuleNotFoundError: No module named 'pycuda.tools.PooledDeviceAllocation'; 'pycuda.tools' is not a package. If this works in a newer version of PyCUDA and mine is simply too old, is there another way to release memory in my version or an older version of PyCUDA? I would prefer to keep upgrading PyCUDA as a last resort, since my NVIDIA card is an older 2060-series model and a newer version of PyCUDA might not support it.

Thanks a lot in advance.

import pycuda.driver as cuda
import pycuda.autoinit
from pycuda.compiler import SourceModule
import os

_path = r"D:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\bin\Hostx64\x64"

if os.system("cl.exe"):
   os.environ['PATH'] += ';' + _path
if os.system("cl.exe"):
   raise RuntimeError("cl.exe still not found, path probably incorrect")

import numpy as np

a = np.zeros(1000000000).astype(np.float32)
a_gpu = cuda.mem_alloc(a.nbytes)
cuda.memcpy_htod(a_gpu, a)

mod = SourceModule("""
  __global__ void func1(float *a)
  {
    a[0] = 1;
  }
  """)
      
func = mod.get_function("func1")
func(a_gpu, block=(1,1,1))

a_out = np.empty_like(a)
cuda.memcpy_dtoh(a_out, a_gpu)
print(a_out)

# Memory release code wanted here

b = np.zeros(1000000000).astype(np.float32)
b_gpu = cuda.mem_alloc(b.nbytes)
cuda.memcpy_htod(b_gpu, b)

mod = SourceModule("""
  __global__ void func2(float *b)
  {
    b[1] = 1;
  }
  """)
      
func = mod.get_function("func2")
func(b_gpu, block=(1,1,1))

b_out = np.empty_like(b)
cuda.memcpy_dtoh(b_out, b_gpu)
print(b_out)

[1. 0. 0. ... 0. 0. 0.]
Traceback (most recent call last):

  File "D:\PythonProjects\Test\CUDA\Test_PyCUDA_MemoryRelease.py", line 47, in <module>
    b_gpu = cuda.mem_alloc(b.nbytes)

MemoryError: cuMemAlloc failed: out of memory

Upvotes: 2

Views: 2258

Answers (2)

Ed Behn

Reputation: 460

The device memory should be freed by deleting the device allocation object:

del a_gpu

Upvotes: 0

user2314737

Reputation: 29407

Try calling free() on the DeviceAllocation object (in this case a_gpu):

import numpy as np
import pycuda.driver as cuda

a = np.zeros(1000000000).astype(np.float32)
a_gpu = cuda.mem_alloc(a.nbytes)
a_gpu.free()

From the documentation:

free() Release the held device memory now instead of when this object becomes unreachable. Any further use of the object is an error and will lead to undefined behavior.

Check:

cuda.mem_get_info()

Upvotes: 2
