Aseem

Reputation: 6779

How to release GPU memory & reuse the same buffer for a different array in PyOpenCL?

Following is my working code for reference:

import numpy
import pyopencl as cl
import pyopencl.array  # needed for cl.array.vec.float4

vector = numpy.array([1, 2, 4, 8], numpy.float32) #cl.array.vec.float4
matrix = numpy.zeros((1, 4), cl.array.vec.float4)
matrix[0, 0] = (1, 2, 4, 8)
matrix[0, 1] = (16, 32, 64, 128)
matrix[0, 2] = (3, 6, 9, 12)
matrix[0, 3] = (5, 10, 15, 25)
# vector[0] = (1, 2, 4, 8)


platform = cl.get_platforms() # all platforms that exist on this machine
device = platform[0].get_devices(device_type=cl.device_type.GPU) # all GPUs on the first platform
context = cl.Context(devices=[device[0]]) # context for the devices in the "device" list above; context.num_devices gives the number of devices in this context
print("everything good so far")
program=cl.Program(context,"""
__kernel void matrix_dot_vector(__global const float4 * matrix,__global const float *vector,__global float *result)
{
int gid = get_global_id(0);

result[gid]=dot(matrix[gid],vector[0]);
}

""" ).build()
queue=cl.CommandQueue(context)
# queue = cl.CommandQueue(context, device) # per-device queue, if we plan on using multiple GPUs for parallel processing

mem_flags = cl.mem_flags
matrix_buf = cl.Buffer(context, mem_flags.READ_ONLY | mem_flags.COPY_HOST_PTR, hostbuf=matrix)
vector_buf = cl.Buffer(context, mem_flags.READ_ONLY | mem_flags.COPY_HOST_PTR, hostbuf=vector)
matrix_dot_vector = numpy.zeros(4, numpy.float32)
global_size_of_GPU= 0
destination_buf = cl.Buffer(context, mem_flags.WRITE_ONLY, matrix_dot_vector.nbytes)
# threads_size_buf = cl.Buffer(context, mem_flags.WRITE_ONLY, global_size_of_GPU.nbytes)
program.matrix_dot_vector(queue, matrix_dot_vector.shape, None, matrix_buf, vector_buf, destination_buf)

## Step #11. Move the kernel’s output data to host memory.
cl.enqueue_copy(queue, matrix_dot_vector, destination_buf)
# cl.enqueue_copy(queue, global_size_of_GPU, threads_size_buf)
print(matrix_dot_vector)
# print(global_size_of_GPU)

# COPY SAME ARRAY FROM GPU AGAIN
cl.enqueue_copy(queue, matrix_dot_vector, destination_buf)
print(matrix_dot_vector)
print('copied same array twice')
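As a host-side sanity check (plain NumPy, not part of the OpenCL code above): the kernel computes one dot product per row of the matrix against the float4 vector, which is just a matrix-vector product:

```python
import numpy

# The same rows as the float4 matrix in the question, stored as a 4x4 float32 array.
matrix = numpy.array([[1, 2, 4, 8],
                      [16, 32, 64, 128],
                      [3, 6, 9, 12],
                      [5, 10, 15, 25]], numpy.float32)
vector = numpy.array([1, 2, 4, 8], numpy.float32)

# Each result[gid] = dot(matrix[gid], vector[0]), i.e. one row of matrix @ vector.
expected = matrix @ vector
print(expected)  # [  85. 1360.  147.  285.]
```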
  1. How can I free the memory held by matrix_buf & destination_buf on the GPU? One is read-only and the other write-only.
  2. How can I load a different matrix array into the same matrix_buf, without having to create a new buffer in PyOpenCL? I read that loading new data into an existing buffer is much faster than recreating same-sized buffers each time.
  3. Is it OK if the new array I load into the old buffer is smaller than the old array that was in it? Does the new array have to be exactly the same size as the buffer?

Upvotes: 0

Views: 1775

Answers (2)

Aseem

Reputation: 6779

  1. matrix_buf.release() & destination_buf.release() - this releases the memory assigned to the respective buffers on the GPU. It's better to release memory once it's no longer needed, to avoid running into memory errors. When the process exits, PyOpenCL frees all GPU memory automatically. -{by doqtor}
  2. cl.enqueue_copy(queue, matrix_buf, matrix_2) - loads a new matrix_2 array into matrix_buf without recreating the matrix buffer.
  3. It's OK to reuse an existing buffer and use only part of it. On the kernel side we have control over which part we want to access. -{by doqtor}
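Putting points 1 and 2 together, a minimal sketch (variable names are illustrative, echoing the question's code; it assumes PyOpenCL is installed and skips the device steps entirely when no OpenCL runtime is found):

```python
import numpy

try:
    import pyopencl as cl
    have_opencl = bool(cl.get_platforms())
except Exception:  # pyopencl missing or no OpenCL runtime installed
    have_opencl = False

# Hypothetical stand-in for the question's matrix (plain float32 here).
matrix_2 = numpy.arange(4, dtype=numpy.float32)

if have_opencl:
    context = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(context)
    mem_flags = cl.mem_flags
    matrix_buf = cl.Buffer(context, mem_flags.READ_ONLY | mem_flags.COPY_HOST_PTR,
                           hostbuf=matrix_2)

    # 2. Load new host data into the existing buffer instead of allocating a new one.
    matrix_2 *= 10
    cl.enqueue_copy(queue, matrix_buf, matrix_2)
    queue.finish()

    # 1. Explicitly free the GPU allocation once the buffer is no longer needed.
    matrix_buf.release()
```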

Upvotes: 1

doqtor

Reputation: 8484

  • Re 1. I believe the buffer will be released when the buffer's variable goes out of scope, or you can call release() explicitly. Whether the buffer is read-only or write-only is not important in this case.
  • Re 2. Try pyopencl.enqueue_map_buffer(), which returns an array that can be modified from the host side. More here.
  • Re 3. It's OK to reuse an existing buffer and use only part of it. On the kernel side you have control over which part you want to access.
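For Re 2., a hedged sketch of pyopencl.enqueue_map_buffer() (buffer name and sizes are illustrative; the guard skips the demo when no OpenCL runtime is present):

```python
import numpy

try:
    import pyopencl as cl
    have_opencl = bool(cl.get_platforms())
except Exception:  # pyopencl missing or no OpenCL runtime installed
    have_opencl = False

result = None
if have_opencl:
    context = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(context)
    buf = cl.Buffer(context, cl.mem_flags.READ_WRITE, size=4 * numpy.float32().nbytes)

    # Map the buffer into host memory; writes to `mapped` go to the buffer itself.
    mapped, _event = cl.enqueue_map_buffer(
        queue, buf, cl.map_flags.WRITE, 0, (4,), numpy.float32)
    mapped[:] = [1, 2, 3, 4]
    mapped.base.release(queue)  # unmap once done modifying from the host side

    # Read the buffer back to confirm the mapped writes landed.
    out = numpy.empty(4, numpy.float32)
    cl.enqueue_copy(queue, out, buf)
    result = out.tolist()
print(result)
```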

Upvotes: 0
