OpenCL kernel argument struct has zero values

Question

I'm having several problems with OpenCL (total noob), but I think that if I manage to solve this one I will be able to solve some of the others. I have the following kernel, which should store in a double array a number calculated from the data of a struct. The argument that I pass to the kernel is an array of structs; it is initialised and its values are non-zero (I tested it).

When executing the kernel, though, I get a "Floating point exception". If I understand correctly, that means the local_density variable is zero and the division causes the error. What I don't get is why it is zero, since on the host the values are non-zero. Am I doing something wrong in the kernel?

#pragma OPENCL EXTENSION cl_khr_fp64 : enable
typedef struct
{
double speeds[9];
} t_speed;

__kernel void prepare(__global const t_speed* cells,
                  __global const int*     obstacles,
                  __global       double*  results,
                           const unsigned int count)
{
  int pos = get_global_id(0);
  if(pos >= count) return;
  if(obstacles[pos] == 1) results[pos] = 0.00;
  else
  {
    double local_density = 0.00;
    for(int kk = 0; kk < 9; kk++)
      local_density += cells[pos].speeds[kk];
    results[pos] = (cells[pos].speeds[1] + cells[pos].speeds[5] +
                    cells[pos].speeds[8] - (cells[pos].speeds[3] +
                    cells[pos].speeds[6] + cells[pos].speeds[7])) /
                    local_density;
  }
}

Here is also the allocation of the variable that I pass as an argument. params->ny and params->nx have correct values.

cells = (t_speed*) malloc(sizeof(t_speed) * (params->ny * params->nx));

Below is the buffer creation and argument setup for the cells variable.

m_cells = clCreateBuffer(context, CL_MEM_READ_ONLY, sizeof(t_speed) * count, NULL, NULL);
err  = clEnqueueWriteBuffer(commands, m_cells, CL_TRUE, 0, sizeof(t_speed) * count, cells, 0, NULL, NULL);
err |= clSetKernelArg(av_velocity_prepare_kernel,  0, sizeof(cl_mem), &m_cells);

------------------------------------------ EDIT ------------------------------------------

OK, what is really weird is that I'm getting the same error (Floating point exception) even with the following very simple kernel. Does anyone have a clue?

#pragma OPENCL EXTENSION cl_khr_fp64 : enable
__kernel void test(__global float*  result, const unsigned int n)
{
  int i = get_global_id(0);
  if(i >= n) return;
  result[i] += 1.0f;
}

George Karanikas · Accepted Answer

OK, so the cause was completely different from what I thought. The kernel and its arguments were fine; the problem was in how I was calling

clEnqueueNDRangeKernel (command_queue, kernel, work_dim, *global_work_offset,     
                        *global_work_size, *local_work_size, num_events_in_wait_list,
                        *event_wait_list, *event)

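The global_work_size was not divisible by local_work_size. OpenCL 1.x requires the global size to be a multiple of the local size in every dimension, and breaking that rule is what caused the Floating point exception.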
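
One common way to satisfy that requirement is to round the global size up to the next multiple of the local size before enqueueing the kernel. The following is only a minimal sketch: round_up_global_size is a hypothetical helper name, not part of the OpenCL API, and it assumes a one-dimensional range like the one used by the kernels in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API call): round `count` up to the
 * next multiple of `local`, so that the result % local == 0. */
static size_t round_up_global_size(size_t count, size_t local)
{
    assert(local > 0);  /* a zero local size would divide by zero here */
    return ((count + local - 1) / local) * local;
}
```

The rounded value would then be passed as global_work_size. The extra work-items it creates are harmless for the kernels above, because both already return early when the index is not below count. Alternatively, passing NULL as local_work_size lets the OpenCL implementation pick a work-group size itself.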
