Allocation error with pyopencl with simple multiplication in for-loop

Question

I am using pyopencl to speed up my calculations on a GPU, and I am currently mystified by the following problem.

I'm doing a simple multiplication of two arrays in a for loop, using the following code:

import numpy as np
import pyopencl as cl
import pyopencl.array as cl_array
from pyopencl.elementwise import ElementwiseKernel

ctx = cl.create_some_context(0)
queue = cl.CommandQueue(ctx)

multiply = ElementwiseKernel(ctx,
           "float *x, float *y, float *z",
           "z[i] = x[i] * y[i]",
           "multiplication")

x = cl_array.arange(queue, 1000000, dtype=np.complex64)
y = cl_array.arange(queue, 1000000, dtype=np.complex64)
z = cl_array.empty_like(x)

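for n in range(10000):
    z = x*y                           # option 1: array operator
    multiply(x.real, y.real, z.real)  # option 2: kernel on the real parts
    multiply(x, y, z)                 # option 3: kernel on the full arrays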

The last three lines all do the same thing, namely the multiplication. However, the first two options each result in the following error (I ran each option on its own, with the other two commented out):

pyopencl.MemoryError: clEnqueueNDRangeKernel failed: mem object allocation failure

I'm at a loss as to why the first two options run into allocation errors.

NOTES:

GPU: [0] pyopencl.Device 'Capeverde' on 'AMD Accelerated Parallel Processing' at 0x2a76d90

>>> pyopencl.VERSION
(2013, 1)

I am aware that the complex type is not handled correctly, but if I change the arrays to np.float32 I still get the same problem.
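
For a sense of the sizes involved, here is a rough back-of-the-envelope sketch. It uses the array length and loop count from the code above, and it assumes (this is only an assumption, not something I have confirmed) that each iteration allocates one fresh result buffer and that none of them is released before the loop ends. The helper names `buffer_bytes` and `worst_case_bytes` are made up for illustration.

```python
# Element sizes in bytes for the two dtypes mentioned in the question.
ITEMSIZE = {"complex64": 8, "float32": 4}

N_ELEMENTS = 1_000_000   # array length used in the question
N_ITERATIONS = 10_000    # loop count used in the question


def buffer_bytes(n_elements, dtype_name):
    """Size in bytes of one buffer holding n_elements of the given dtype."""
    return n_elements * ITEMSIZE[dtype_name]


def worst_case_bytes(n_elements, dtype_name, n_iterations):
    """Upper bound on memory requested over the loop.

    Assumption: one new buffer per iteration, and nothing is freed
    until the loop has finished.
    """
    return buffer_bytes(n_elements, dtype_name) * n_iterations


for name in ("complex64", "float32"):
    per_buffer = buffer_bytes(N_ELEMENTS, name)
    total = worst_case_bytes(N_ELEMENTS, name, N_ITERATIONS)
    print(f"{name}: {per_buffer / 1e6:.0f} MB per buffer, "
          f"up to {total / 1e9:.0f} GB over the loop")
```

Under that assumption a single buffer is small, but the total requested over the whole loop is far larger than any one buffer, for either dtype. Whether temporaries actually pile up like this depends on when the runtime releases them.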
