Reputation: 3
I need to pass some data to an OpenCL kernel using PyOpenCL, or some other workaround in Python. The data is read on the kernel side as a struct, and I can't change the kernel because it works fine and is part of a much bigger project that my code must work with.
The kernel looks like this:
typedef struct VglClStrEl{
float data[VGL_ARR_CLSTREL_SIZE];
int ndim;
int shape[VGL_ARR_SHAPE_SIZE];
int offset[VGL_ARR_SHAPE_SIZE];
int size;
} VglClStrEl;
typedef struct VglClShape{
int ndim;
int shape[VGL_ARR_SHAPE_SIZE];
int offset[VGL_ARR_SHAPE_SIZE];
int size;
} VglClShape;
__kernel void kernel(__global unsigned char* img_input,
__global unsigned char* img_output,
__constant VglClShape* img_shape,
__constant VglClStrEl* window)
{
// do what is needed
}
As you can see, the VglClShape and VglClStrEl structures contain arrays of different types alongside fixed-size scalar variables.
The workaround in [1] supports only structs whose arrays all share one type (or I failed tragically to find a way to do it with multiple array types).
The workaround in [2] is the PyOpenCL documentation's reference for passing Python data to an OpenCL kernel struct. This approach doesn't support arrays at all.
So, how can I pass the Python data in a form the OpenCL kernel can read? I already have all the data on the Python side; I just need to know how to pass it from Python to the kernel.
Before you ask: I am using Python 3 and I CAN NOT CHANGE THE KERNEL.
And yes, the array sizes are static. You can assume something like this:
VGL_ARR_CLSTREL_SIZE=256;
VGL_ARR_SHAPE_SIZE=20;
[1] Passing struct with pointer members to OpenCL kernel using PyOpenCL
[2] https://documen.tician.de/pyopencl/howto.html#how-to-use-struct-types-with-pyopencl
Upvotes: 0
Views: 574
Reputation: 300
There is a hackish way to do this that requires some tedious byte wrangling. Presumably you are OK with deploying a small OpenCL probing kernel? (PyOpenCL does this under the hood for some ops in any case)
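The basic idea is to ask the device itself for the size of each struct and the byte offset of each member, build a byte array of that size on the Python side, copy each field into it at the matching offset, and pass the byte array to the kernel.
The following kernel does the probing: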
__kernel void get_struct_sizes( __global uint *struct_sizes )
{
    const uint global_id = get_global_id(0u)+get_global_id(1u)*get_global_size(0u);
    VglClStrEl vgclstrel;
    VglClShape vgclshape;
    // uintptr_t avoids truncating addresses on devices with 64-bit pointers.
    uintptr_t base;
    printf("In GPU (probing):\n Kernel instance = %u\n", global_id);
    // Only one work-item needs to write the results.
    if (global_id==0) {
        // VglClStrEl: total size, then member offsets relative to the first member.
        base = (uintptr_t)&(vgclstrel.data);
        struct_sizes[0] = (uint)sizeof(vgclstrel);
        struct_sizes[1] = (uint)((uintptr_t)&(vgclstrel.ndim)-base);
        struct_sizes[2] = (uint)((uintptr_t)&(vgclstrel.shape)-base);
        struct_sizes[3] = (uint)((uintptr_t)&(vgclstrel.offset)-base);
        struct_sizes[4] = (uint)((uintptr_t)&(vgclstrel.size)-base);
        // VglClShape: same idea, stored in the next four slots.
        base = (uintptr_t)&(vgclshape.ndim);
        struct_sizes[5] = (uint)sizeof(vgclshape);
        struct_sizes[6] = (uint)((uintptr_t)&(vgclshape.shape)-base);
        struct_sizes[7] = (uint)((uintptr_t)&(vgclshape.offset)-base);
        struct_sizes[8] = (uint)((uintptr_t)&(vgclshape.size)-base);
    }
    return;
}
Execute this kernel, read struct_sizes back to the host, and keep the VglClShape entries (indices 5 to 8) as vgclshape_sizes. Then create this array:
import numpy as np
# One zeroed byte per byte of the device-side VglClShape struct.
img_shape = np.zeros((vgclshape_sizes[0]), dtype=np.uint8)
and copy into it what you need:
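def copy_into_byte_array(value, byte_array, offset, dtype=np.int32):
    # Convert to the member's element type first (int32 matches OpenCL's int),
    # so plain Python ints and lists serialise to the expected number of bytes.
    raw = np.asarray(value, dtype=dtype).tobytes()
    offset = int(offset)
    byte_array[offset:offset+len(raw)] = np.frombuffer(raw, dtype=np.uint8)
# ndim is the first member, so it sits at offset 0.
copy_into_byte_array(ndim, img_shape, 0)
copy_into_byte_array(shape, img_shape, vgclshape_sizes[1])
copy_into_byte_array(offset, img_shape, vgclshape_sizes[2])
copy_into_byte_array(size, img_shape, vgclshape_sizes[3])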
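The same byte wrangling can also be sketched with only the standard library's struct module, which makes the byte order explicit. This is an illustration under stated assumptions rather than a drop-in solution: the layout dictionary below is hypothetical (it is what a tightly packed VglClShape with VGL_ARR_SHAPE_SIZE=20 would give) and should be replaced by the values probed on your device, and the names pack_vgl_shape and PROBED_SHAPE_LAYOUT are made up for this example.

```python
import struct

VGL_ARR_SHAPE_SIZE = 20  # value assumed in the question

# Hypothetical probe result for VglClShape (tightly packed, 4-byte ints).
# Replace with the sizes and offsets reported by the probing kernel.
PROBED_SHAPE_LAYOUT = {"sizeof": 168, "ndim": 0, "shape": 4, "offset": 84, "size": 164}

def _padded(values, length):
    """Zero-pad a sequence to the fixed array length the struct expects."""
    values = list(values)
    if len(values) > length:
        raise ValueError("too many values for the fixed-size array")
    return values + [0] * (length - len(values))

def pack_vgl_shape(ndim, shape, offset, size, layout=PROBED_SHAPE_LAYOUT):
    """Return the raw bytes of a VglClShape using the given member offsets."""
    n = VGL_ARR_SHAPE_SIZE
    buf = bytearray(layout["sizeof"])  # zero-filled, so any padding stays zero
    # "<" forces little-endian; "i" is a 4-byte signed int, like OpenCL's int.
    struct.pack_into("<i", buf, layout["ndim"], ndim)
    struct.pack_into(f"<{n}i", buf, layout["shape"], *_padded(shape, n))
    struct.pack_into(f"<{n}i", buf, layout["offset"], *_padded(offset, n))
    struct.pack_into("<i", buf, layout["size"], size)
    return bytes(buf)

# Example values only; use your own image description here.
raw = pack_vgl_shape(ndim=2, shape=[640, 480], offset=[1, 640], size=640 * 480)
```

The resulting bytes can be wrapped with np.frombuffer(raw, dtype=np.uint8) and copied to a device buffer like any other array. Change "<" to ">" if the device turns out to be big-endian.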
I've skipped some steps here; filling them in you'll find this approach works. I was able to pass a demo struct to a dummy copy of your inviolate kernel.
I would be interested to hear if there are more elegant ways to do any or all of these steps. I would also expect problems with endianness and the like, which would otherwise be handled transparently. With luck you can work around them.
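One possibly tidier variant, offered as a sketch and not something run against the real kernel: mirror the two structs with ctypes.Structure and compare the host-side layout with what the probing kernel reports. If they agree, bytes(instance) gives the buffer directly. The assumption, which may not hold on every device, is that the host compiler's layout rules and byte order match the device's; the helper name host_layout is hypothetical.

```python
import ctypes

VGL_ARR_CLSTREL_SIZE = 256  # values assumed in the question
VGL_ARR_SHAPE_SIZE = 20

class VglClStrEl(ctypes.Structure):
    _fields_ = [
        ("data", ctypes.c_float * VGL_ARR_CLSTREL_SIZE),
        ("ndim", ctypes.c_int32),
        ("shape", ctypes.c_int32 * VGL_ARR_SHAPE_SIZE),
        ("offset", ctypes.c_int32 * VGL_ARR_SHAPE_SIZE),
        ("size", ctypes.c_int32),
    ]

class VglClShape(ctypes.Structure):
    _fields_ = [
        ("ndim", ctypes.c_int32),
        ("shape", ctypes.c_int32 * VGL_ARR_SHAPE_SIZE),
        ("offset", ctypes.c_int32 * VGL_ARR_SHAPE_SIZE),
        ("size", ctypes.c_int32),
    ]

def host_layout(struct_type):
    """[sizeof, offset of 2nd member, 3rd, ...], the same order the probe kernel uses."""
    names = [name for name, _ in struct_type._fields_]
    offsets = [getattr(struct_type, name).offset for name in names[1:]]
    return [ctypes.sizeof(struct_type)] + offsets

# Compare these lists with struct_sizes[0:5] and struct_sizes[5:9] from the device.
strel_layout = host_layout(VglClStrEl)
shape_layout = host_layout(VglClShape)

# Fill a VglClShape with example values and take its raw bytes.
img_shape = VglClShape()
img_shape.ndim = 2
img_shape.shape[:2] = [640, 480]
img_shape.offset[:2] = [1, 640]
img_shape.size = 640 * 480
raw_shape = bytes(img_shape)
```

If the layouts differ from the probed values, fall back to writing each member at its probed offset as described above.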
Upvotes: 1