bor

Reputation: 658

Basic programming sample of OpenCL from Apple fails to run on GPU

I started learning some basics about OpenCL a while ago and decided to give the "Basic programming sample" from Apple a go. It runs OK on the CPU, but when I select the GPU as the target device I get err = -45 from

err = gclExecKernelAPPLE(k, ndrange, &kargs);

This error code translates to CL_INVALID_PROGRAM_EXECUTABLE. Any idea how I can correct the sample code?
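For reference, the numeric-to-name translation mentioned above can be sketched as a small lookup helper. This is only an illustrative sketch, not part of Apple's sample: the names cl_error_name and report_cl_error and the EX_ prefixed constants are hypothetical. The numeric values are copied from the Khronos cl.h header and redefined locally so the snippet compiles without the OpenCL framework; in real code you would include the OpenCL header and use its CL_* constants directly.

```c
#include <stdio.h>

/* Assumption: values copied from the Khronos cl.h header. The EX_ prefix
   is a hypothetical naming choice so that this sketch does not clash
   with (or depend on) the real OpenCL headers. */
enum {
    EX_CL_SUCCESS                    = 0,
    EX_CL_DEVICE_NOT_FOUND           = -1,
    EX_CL_DEVICE_NOT_AVAILABLE       = -2,
    EX_CL_COMPILER_NOT_AVAILABLE     = -3,
    EX_CL_BUILD_PROGRAM_FAILURE      = -11,
    EX_CL_INVALID_PROGRAM            = -44,
    EX_CL_INVALID_PROGRAM_EXECUTABLE = -45,
    EX_CL_INVALID_KERNEL_NAME        = -46,
    EX_CL_INVALID_KERNEL             = -48
};

/* Map a handful of OpenCL error codes to their symbolic names. */
const char *cl_error_name(int err) {
    switch (err) {
    case EX_CL_SUCCESS:                    return "CL_SUCCESS";
    case EX_CL_DEVICE_NOT_FOUND:           return "CL_DEVICE_NOT_FOUND";
    case EX_CL_DEVICE_NOT_AVAILABLE:       return "CL_DEVICE_NOT_AVAILABLE";
    case EX_CL_COMPILER_NOT_AVAILABLE:     return "CL_COMPILER_NOT_AVAILABLE";
    case EX_CL_BUILD_PROGRAM_FAILURE:      return "CL_BUILD_PROGRAM_FAILURE";
    case EX_CL_INVALID_PROGRAM:            return "CL_INVALID_PROGRAM";
    case EX_CL_INVALID_PROGRAM_EXECUTABLE: return "CL_INVALID_PROGRAM_EXECUTABLE";
    case EX_CL_INVALID_KERNEL_NAME:        return "CL_INVALID_KERNEL_NAME";
    case EX_CL_INVALID_KERNEL:             return "CL_INVALID_KERNEL";
    default:                               return "unknown OpenCL error";
    }
}

/* Print a readable message to stderr when a call returned a nonzero code. */
void report_cl_error(const char *what, int err) {
    if (err != EX_CL_SUCCESS)
        fprintf(stderr, "%s failed: %d (%s)\n", what, err, cl_error_name(err));
}
```

Calling report_cl_error("gclExecKernelAPPLE", err) right after the failing call would then log the symbolic name next to the raw number.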

Automatically generated kernel.cl.c code looks like this (+ includes on top):

static void initBlocks(void);

// Initialize static data structures
static block_kernel_pair pair_map[1] = {
    { NULL, NULL }
};

static block_kernel_map bmap = { 0, 1, initBlocks, pair_map };

// Block function
void (^square_kernel)(const cl_ndrange *ndrange, cl_float* input, cl_float* output) =
^(const cl_ndrange *ndrange, cl_float* input, cl_float* output) {
    int err = 0;
    cl_kernel k = bmap.map[0].kernel;
    if (!k) {
        initBlocks();
        k = bmap.map[0].kernel;
    }
    if (!k)
        gcl_log_fatal("kernel square does not exist for device");
    kargs_struct kargs;
    gclCreateArgsAPPLE(k, &kargs);
    err |= gclSetKernelArgMemAPPLE(k, 0, input, &kargs);
    err |= gclSetKernelArgMemAPPLE(k, 1, output, &kargs);
    gcl_log_cl_fatal(err, "setting argument for square failed");

    err = gclExecKernelAPPLE(k, ndrange, &kargs);

    gcl_log_cl_fatal(err, "Executing square failed");
    gclDeleteArgsAPPLE(k, &kargs);
};

// Initialization functions
static void initBlocks(void) {
    const char* build_opts = " -cl-std=CL1.1";
    static dispatch_once_t once;
    dispatch_once(&once,
    ^{ int err = gclBuildProgramBinaryAPPLE("OpenCL/kernel.cl", "", &bmap, build_opts);
        if (!err) {
            assert(bmap.map[0].block_ptr == square_kernel && "mismatch block");
            bmap.map[0].kernel = clCreateKernel(bmap.program, "square", &err);
        }
    });
}

__attribute__((constructor))
static void RegisterMap(void) {
    gclRegisterBlockKernelMap(&bmap);
    bmap.map[0].block_ptr = square_kernel;
}

Upvotes: 4

Views: 1596

Answers (1)

gavinb

Reputation: 20018

I saw this same problem when running under 10.7.3, while a machine on 10.7.5 worked fine. I noticed the CVMCompiler process was crashing after each invocation of my app.

Inspecting the stack trace, I noticed it was crashing when trying to parse the bitcode for compilation into native code. Since the parsing of the bitcode failed, there was no resulting compiled program for gclExecKernelAPPLE() to execute, hence the error.

Try upgrading to 10.7.5, or indeed 10.8, and the problem should go away. (I just tested this and it does indeed fix the problem.)

Upvotes: 1
