pierre tautou

Reputation: 817

OpenCL Bus error

I have a problem with my OpenCL code. I compile and run it on the CPU (Core 2 Duo) under Mac OS X 10.6.7. Here is the code:

#define BUFSIZE (524288)    // 512 KB
#define BLOCKBYTES (32)    // 32 B
__kernel void test(__global unsigned char *in,
                   __global unsigned char *out,
                   unsigned int srcOffset,
                   unsigned int dstOffset) {
    int grId = get_group_id(0);
    unsigned char msg[BUFSIZE];
    srcOffset = grId * BUFSIZE;
    dstOffset = grId * BLOCKBYTES;

    // Copy from global to private memory
    size_t i;
    for (i = 0; i < BUFSIZE; i++)
        msg[i] = in[ srcOffset + i ];

    // Make some computation here, not complicated logic    

    // Copy from private to global memory
    for (i = 0; i < BLOCKBYTES; i++)
        out[ dstOffset + i ] = msg[i];
}

The code fails at runtime with "Bus error". When I add a debugging printf to the loop that copies from global to private memory, I can see that the error occurs there, at a different iteration of i each time. When I reduce BUFSIZE to 262144 (256 KB), the code runs fine. I tried running only one work-item in one work-group. The *in pointer refers to a memory area that holds thousands of KB of data. I suspect a limit on private memory, but in that case I would expect the error to be raised when the memory is allocated, not during the copy.

Here is my OpenCL device query:

 ---------------------------------
 Device Intel(R) Core(TM)2 Duo CPU     P7550  @ 2.26GHz
 ---------------------------------
  CL_DEVICE_NAME:           Intel(R) Core(TM)2 Duo CPU     P7550  @ 2.26GHz
  CL_DEVICE_VENDOR:             Intel
  CL_DRIVER_VERSION:            1.0
  CL_DEVICE_VERSION:            OpenCL 1.0 
  CL_DEVICE_TYPE:           CL_DEVICE_TYPE_CPU
  CL_DEVICE_MAX_COMPUTE_UNITS:      2
  CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS:   3
  CL_DEVICE_MAX_WORK_ITEM_SIZES:    1 / 1 / 1 
  CL_DEVICE_MAX_WORK_GROUP_SIZE:    1
  CL_DEVICE_MAX_CLOCK_FREQUENCY:    2260 MHz
  CL_DEVICE_ADDRESS_BITS:       32
  CL_DEVICE_MAX_MEM_ALLOC_SIZE:     1024 MByte
  CL_DEVICE_GLOBAL_MEM_SIZE:        1535 MByte
  CL_DEVICE_ERROR_CORRECTION_SUPPORT:   no
  CL_DEVICE_LOCAL_MEM_TYPE:     global
  CL_DEVICE_LOCAL_MEM_SIZE:     16 KByte
  CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE:   64 KByte
  CL_DEVICE_QUEUE_PROPERTIES:       CL_QUEUE_PROFILING_ENABLE
  CL_DEVICE_IMAGE_SUPPORT:      1
  CL_DEVICE_MAX_READ_IMAGE_ARGS:    128
  CL_DEVICE_MAX_WRITE_IMAGE_ARGS:   8
  CL_DEVICE_SINGLE_FP_CONFIG:       denorms INF-quietNaNs round-to-nearest 

  CL_DEVICE_IMAGE <dim>         2D_MAX_WIDTH     8192
                                2D_MAX_HEIGHT    8192
                                3D_MAX_WIDTH     2048
                                3D_MAX_HEIGHT    2048
                                3D_MAX_DEPTH     2048

  CL_DEVICE_EXTENSIONS:         cl_khr_fp64
                                cl_khr_global_int32_base_atomics
                                cl_khr_global_int32_extended_atomics
                                cl_khr_local_int32_base_atomics
                                cl_khr_local_int32_extended_atomics
                                cl_khr_byte_addressable_store
                                cl_APPLE_gl_sharing
                                cl_APPLE_SetMemObjectDestructor
                                cl_APPLE_ContextLoggingFunctions

  CL_DEVICE_PREFERRED_VECTOR_WIDTH_<t>  CHAR 16, SHORT 8, INT 4, LONG 2, FLOAT 4, DOUBLE 2

Upvotes: 2

Views: 758

Answers (1)

Rick-Rainer Ludwig

Reputation: 2401

You declare a variable msg with a size of 512 KB. This variable is placed in private memory, and private memory is not that large, so as far as I know this should not work.

Why do you have the parameters srcOffset and dstOffset? The values passed in are never used, because the kernel overwrites both of them before reading them.

I do not see any other issues. Try allocating local memory instead. Do you have a version of your code that runs without this optimization, one that just computes directly in global memory?
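To make that last suggestion concrete, here is a minimal sketch of the per-group logic with no 512 KB staging array, written as plain C so it can be run on the host as a baseline. The function name test_reference and the group parameter are hypothetical; in the actual kernel, group would come from get_group_id(0) and both pointers would carry the __global qualifier. The copy loop is only a stand-in for the real computation, which the question does not show.

```c
#include <stddef.h>

#define BUFSIZE    (524288)  /* 512 KB of input per work-group, as in the question */
#define BLOCKBYTES (32)      /* 32 B of output per work-group */

/* Hypothetical host-side reference for one work-group.
   In the kernel, `group` would be get_group_id(0) and both
   pointers would be declared __global. */
static void test_reference(const unsigned char *in,
                           unsigned char *out,
                           size_t group)
{
    /* Compute the offsets locally instead of passing them as arguments. */
    const unsigned char *src = in + group * (size_t)BUFSIZE;
    unsigned char *dst = out + group * (size_t)BLOCKBYTES;
    size_t i;

    /* Placeholder for the real computation: read the input in place,
       without copying the whole 512 KB chunk into a private array first. */
    for (i = 0; i < BLOCKBYTES; i++)
        dst[i] = src[i];
}
```

If the real computation needs scratch space, a small fixed-size buffer that is refilled chunk by chunk would keep the private footprint far below the size that triggers the crash; whether that fits the algorithm is something only the full code can tell.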

Upvotes: 1
