Element Green

Reputation: 442

Accessing structured data following a struct in OpenCL

Summary: Does OpenCL permit creating a pointer in a kernel function from a pointer to a structure and a byte offset to data after the structure in the same memory block?

I'm trying to better understand the limitations of OpenCL with regard to pointers and structures. A project I'm currently working on involves processing different kinds of signal nodes, whose state data can differ drastically in size from one processing instance to the next. I'm starting with a low-latency Linux CPU implementation using SCHED_FIFO, so there is no memory allocation and there are no system calls in the processing threads, but I'm trying to plan for an eventual OpenCL implementation.

With this in mind I started designing the algorithm to allocate all the state data as one block, which begins with a structure and has additional data structures and arrays appended, taking care to align each data type properly. Integer offset fields in the structures give the byte positions of the additional data within the buffer. So technically the structures contain no pointers, which would likely not work when passing the data from host to device. However, the size of the state data will differ from one synthesis Node to the next, though the size won't change once a Node is allocated. I'm not sure whether this breaks the "no variable length structures" rule of OpenCL.

Simple example (pseudo OpenCL code):

// Additional data following Node structure:
// cl_float fArray[fArrayLen];
// cl_uint iArray[iArrayLen];
typedef struct
{
  cl_float val1;
  cl_float val2;
  cl_uint fArrayOfs;
  cl_uint fArrayLen;
  cl_uint iArrayOfs;
  cl_uint iArrayLen;
  ...
} Node;

void
node_process (__global Node *node)
{
  __global cl_float *fArray;
  __global cl_uint *iArray;

  // Construct pointers to arrays following Node structure
  fArray = ((cl_uchar *)node) + node->fArrayOfs;
  iArray = ((cl_uchar *)node) + node->iArrayOfs;
  ...
}
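For reference, the layout described above can be sketched on the host side in plain C. This is a minimal sketch under assumptions, not the project's actual allocator: `node_alloc`, `align_up`, `node_floats` and `node_uints` are hypothetical names, fixed-width standard types stand in for `cl_float` and `cl_uint` so that it builds without the OpenCL headers, and the elided fields of `Node` are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Host-side mirror of the Node header (elided fields omitted). */
typedef struct
{
  float    val1;
  float    val2;
  uint32_t fArrayOfs;  /* byte offset of the float array from the block start */
  uint32_t fArrayLen;  /* number of float elements */
  uint32_t iArrayOfs;  /* byte offset of the uint array from the block start */
  uint32_t iArrayLen;  /* number of uint elements */
} Node;

/* Round ofs up to the next multiple of align (align must be a power of two). */
static size_t
align_up (size_t ofs, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (ofs + align - 1) & ~(align - 1);
}

/* Hypothetical helper: allocate one zeroed block holding the header followed
   by both arrays, and record the array offsets and lengths in the header.
   The total size in bytes is returned through sizeOut. */
static Node *
node_alloc (uint32_t fLen, uint32_t iLen, size_t *sizeOut)
{
  size_t fOfs = align_up (sizeof (Node), sizeof (float));
  size_t iOfs = align_up (fOfs + fLen * sizeof (float), sizeof (uint32_t));
  size_t size = iOfs + iLen * sizeof (uint32_t);
  Node *node = calloc (1, size);

  if (!node)
    return NULL;

  node->fArrayOfs = (uint32_t)fOfs;
  node->fArrayLen = fLen;
  node->iArrayOfs = (uint32_t)iOfs;
  node->iArrayLen = iLen;
  *sizeOut = size;
  return node;
}

/* Rebuild the array pointers from the stored byte offsets. */
static float *
node_floats (Node *node)
{
  return (float *)((unsigned char *)node + node->fArrayOfs);
}

static uint32_t *
node_uints (Node *node)
{
  return (uint32_t *)((unsigned char *)node + node->iArrayOfs);
}
```

Because the block holds only offsets, the same bytes stay valid after being copied into a device buffer; the size written to `sizeOut` is the number of bytes that would have to be copied.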

If this isn't possible, does anyone have suggestions for defining complex data structures which are somewhat dynamic in nature, without passing dozens of pointers to kernel functions? They are dynamic only at allocation time, not while the kernel is processing. The only other option I can think of is defining the processing node state as a union and passing the additional data structures as parameters to the kernel function, but this is likely to turn into a huge number of function parameters. Or maybe a __local structure with pointers is permissible?

Upvotes: 1

Views: 277

Answers (1)

pmdj

Reputation: 23446

Yes, this is allowed in OpenCL, as long as you stick to the alignment rules, as you mentioned. You will, however, want to be very careful:

First,

fArray = ((cl_uchar *)node) + node->fArrayOfs;
           ^^^^^^^^^^

You've left out the address space qualifier on the cast here. Make sure you include __global, or the pointer defaults to (IIRC) __private, which takes you straight to the land of undefined behaviour. Generally, I recommend being explicit about the address space in all pointer declarations and casts, as the defaults are often non-obvious.
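A minimal sketch of the pointer construction with the address space spelled out on every cast. Several things here are assumptions made for illustration: `node_sum` and its summing loops are placeholder work rather than anything from the question, the elided fields of `Node` are left out, and the `#ifndef` stub exists only so the same source also builds as plain C for testing on the host.

```c
#ifndef __OPENCL_VERSION__
/* Host build only: plain C has no address spaces, so stub the qualifier out. */
#define __global
#endif

/* Header at the start of the block; the arrays follow it in the same buffer. */
typedef struct
{
  float val1;
  float val2;
  unsigned int fArrayOfs;  /* byte offset of the float array from the block start */
  unsigned int fArrayLen;  /* number of float elements */
  unsigned int iArrayOfs;  /* byte offset of the uint array from the block start */
  unsigned int iArrayLen;  /* number of uint elements */
} Node;

/* Placeholder work: add up every element of both trailing arrays. */
float
node_sum (__global Node *node)
{
  /* The byte pointer keeps the __global qualifier, so the arithmetic and the
     final casts all stay in the same address space as the Node itself. */
  __global unsigned char *base = (__global unsigned char *)node;
  __global float *fArray = (__global float *)(base + node->fArrayOfs);
  __global unsigned int *iArray = (__global unsigned int *)(base + node->iArrayOfs);
  float sum = 0.0f;

  for (unsigned int i = 0; i < node->fArrayLen; i++)
    sum += fArray[i];

  for (unsigned int i = 0; i < node->iArrayLen; i++)
    sum += (float)iArray[i];

  return sum;
}
```

Note that the result of the byte-pointer arithmetic is cast explicitly to the target pointer type; assigning an unsigned char pointer directly to a float pointer is not valid C.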

Second, if you're planning to run this on GPUs and the control flow and memory access patterns of adjacent work-items are very different, you are in for a bad time, performance-wise. I recommend reading the GPU vendors' OpenCL performance optimisation guides before deciding how to split up the work and how to design the data structures.

Upvotes: 2
