AsmCoder8088

Reputation: 51

OpenCL: Passing a pointer to local memory

I have the following example code:

int compute_stuff(int *array)
{
    /* do stuff with array */
    ...
    return x;
}

__kernel void my_kernel()
{
    __local int local_mem_block[LENGTH*MY_LOCAL_WORK_SIZE];
    int result;

    /* do stuff with local memory block */
    result = compute_stuff(local_mem_block + (LENGTH*get_local_id(0)));
    ...
}

The above example compiles and executes fine on my NVIDIA card (RTX 2080).
But when I try to compile it on a MacBook with an AMD card, I get the following error:

error: passing '__local int *' to parameter of type '__private int *' changes address space of pointer

OK, so then I change the "compute_stuff" function to the following:

int compute_stuff(__local int *array)

Now both NVIDIA and AMD compile it fine, no problem. But then I have one more test: compiling it on the same MacBook using WINE (rather than booting into Windows via Boot Camp), which gives the following error:

error: parameter may not be qualified with an address space

So it seems as though one is not supposed to qualify a function parameter with an address space. Fair enough. But if I do not do that, then the AMD compiler on native Windows thinks that I am trying to change the address space of the pointer to private (I guess because it assumes that all function arguments will be private?).

What is a good way to handle this so that all three environments are happy to compile it? As a last resort, I am thinking of simply having the program check to see if the build failed without qualifier, and if so, substitute in the "__local" qualifier and build a second time... Seems like a hack, but it could work.

Upvotes: 3

Views: 982

Answers (2)

ProjectPhysX

Reputation: 5764

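In OpenCL 2.0, the unqualified int* in int compute_stuff(int *array) points to the __generic address space. The call result = compute_stuff(local_mem_block+...); implicitly converts the __local pointer to __generic, which is allowed according to the OpenCL 2.0 Khronos specification. In OpenCL 1.x, an unqualified pointer parameter instead defaults to __private, which matches the error message you are seeing.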

It could be that AMD defaults to OpenCL 1.2. Try explicitly setting -cl-std=CL2.0 in the options string passed to clBuildProgram() or clCompileProgram().
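As a minimal sketch of how the host side might decide whether to pass that flag: the helper names device_is_cl2 and pick_build_options are hypothetical, and the code assumes you have already read the CL_DEVICE_VERSION string with clGetDeviceInfo(). That string starts with "OpenCL <major>.<minor>". Only 2.x devices are treated as candidates here, since the generic address space is an optional feature on OpenCL 3.0 devices.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the OpenCL API): returns 1 when the
 * CL_DEVICE_VERSION string reports an OpenCL 2.x device, 0 otherwise. */
static int device_is_cl2(const char *device_version)
{
    int major = 0, minor = 0;

    /* The string has the form "OpenCL <major>.<minor> <vendor info>". */
    if (sscanf(device_version, "OpenCL %d.%d", &major, &minor) != 2)
        return 0;

    /* Assumption: only 2.x is accepted; on 3.0 the generic address
     * space is optional and would need a separate capability query. */
    return major == 2;
}

/* Hypothetical helper: picks the options string for clBuildProgram(). */
static const char *pick_build_options(const char *device_version)
{
    return device_is_cl2(device_version) ? "-cl-std=CL2.0" : "";
}
```

The returned string would then go into the fourth argument of clBuildProgram(program, 1, &device, options, NULL, NULL).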

To keep the code compatible with OpenCL 1.2, you can explicitly qualify the pointer parameter as __local: int compute_stuff(__local int *array). OpenCL allows function pointer parameters to point into the __global and __local address spaces, so WINE seems to have a bug here. Inlining the function might work around it: int __attribute__((always_inline)) compute_stuff(__local int *array).

As a last resort, you can use your proposed method. You can detect whether the program is running under WINE like this. With that, you could switch between the two code variants without compiling twice and checking for the error.
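One way to switch between the two variants from a single kernel source is a preprocessor macro selected by a build option. The following is only a sketch: the macro names QUALIFY_LOCAL_PARAMS and PARAM_LOCAL and the helper make_build_options are made up for illustration, and it assumes the unqualified form is the one that builds under WINE.

```c
#include <stdio.h>

/* Kernel-side prelude (OpenCL C held as a host string). PARAM_LOCAL
 * expands to __local only when QUALIFY_LOCAL_PARAMS is defined at
 * build time; otherwise the parameter is left unqualified. */
static const char *kernel_prelude =
    "#ifdef QUALIFY_LOCAL_PARAMS\n"
    "#define PARAM_LOCAL __local\n"
    "#else\n"
    "#define PARAM_LOCAL\n"
    "#endif\n"
    "int compute_stuff(PARAM_LOCAL int *array);\n";

/* Hypothetical helper: writes the options string for clBuildProgram().
 * Returns 0 on success, -1 if the buffer is too small. */
static int make_build_options(char *out, size_t out_size,
                              int qualify_local_params)
{
    int n = snprintf(out, out_size, "%s",
                     qualify_local_params ? "-DQUALIFY_LOCAL_PARAMS" : "");
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

The host would call make_build_options() with 0 when it detects WINE and 1 otherwise, then pass the result as the options argument of clBuildProgram().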

Upvotes: 1

AsmCoder8088

Reputation: 51

I agree with ProjectPhysX that this appears to be a bug in the WINE implementation. I also found that the following appears to satisfy all three environments:

int compute_stuff(__local int * __private array)
{
    ...
}

__kernel void my_kernel()
{
    __local int local_mem_block[LENGTH*MY_LOCAL_WORK_SIZE];
    __local int * __private samples;
    int result;

    samples = local_mem_block + (LENGTH*get_local_id(0));

    result = compute_stuff(samples);
}

The above explicitly states that the pointer itself is private, while the memory it points to is in the local address space. This removes any ambiguity.

Upvotes: 1
