Reputation: 69
I'm converting OpenCL code from my Mac to a Linux box with an NVIDIA Tesla K20c card and have run into a snag when building a simple kernel. My kernel code is this:
char kernel[1024] =
"#pragma OPENCL EXTENSION cl_khr_fp64: enable \
\
kernel void diff(global double* u, \
int N, \
double dx, \
global double* du) \
{ \
size_t i = get_global_id(0); \
int ip = (i+1)%N; \
int im = (i+N-1)%N; \
du[i] = (u[ip] - u[im])/dx/2.; \
}";
I call this with:
const char* srccode = kernel;
cl_program program = clCreateProgramWithSource(context, 1, &srccode, NULL, &err);
err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
cl_kernel diff_kernel = clCreateKernel(program, "diff", &err);
clBuildProgram returns CL_SUCCESS and the build log from clGetProgramBuildInfo is empty, but clCreateKernel returns CL_INVALID_KERNEL_NAME. Any idea why? I've been banging at this for a while and can't find anything. If I change all the doubles to floats and remove the pragma, the problem goes away and it works correctly. So is the pragma to blame? If so, how do I write it correctly?
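For reference, the build log can be read along these lines (a sketch; print_build_log and device are illustrative names, with device being the cl_device_id the context was created for):
#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

/* Print the build log for a program on one device. */
static void print_build_log(cl_program program, cl_device_id device)
{
    size_t log_size = 0;
    /* First query the size of the log, then fetch it. */
    clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG,
                          0, NULL, &log_size);
    char *log = malloc(log_size + 1);
    clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG,
                          log_size, log, NULL);
    log[log_size] = '\0';
    printf("build log:\n%s\n", log);
    free(log);
}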
Upvotes: 1
Views: 2924
Reputation: 8484
Because of the line continuations, the whole kernel[1024] string ends up on a single source line. That is fine for the kernel definition itself, but not for the #pragma: a preprocessor directive must be terminated by a newline. A fixed version looks like this:
char kernel[1024] =
"#pragma OPENCL EXTENSION cl_khr_fp64: enable \n\
\
kernel void diff(global double* u, \
int N, \
double dx, \
global double* du) \
{ \
size_t i = get_global_id(0); \
int ip = (i+1)%N; \
int im = (i+N-1)%N; \
du[i] = (u[ip] - u[im])/dx/2.; \
}";
Upvotes: 3