John Rambo

Reputation: 156

How to pass parameters to ocl kernel using pyopencl?

How can I pass parameters from pyopencl so that the .cl file treats them as preprocessor defines?

Meaning:

foo.cl

# define LIMIT 12
typedef struct {
    uint i[LIMIT];
} foomatic;

would become

foo_nodefs.cl

typedef struct {
    uint i[LIMIT]; // python script passing LIMIT to set it
} foomatic;

Thanks,

John

Upvotes: 1

Views: 1700

Answers (1)

fjarri

Reputation: 9726

Edit: extended the answer with more detail.

There are two ways to do that:

  1. (Metaprogramming) Prepend your preprocessor directives to the string that holds the source code, or even run your own preprocessing step with a templating engine:

    import pyopencl as cl
    import numpy
    import numpy.linalg as la
    
    a = numpy.random.rand(50000).astype(numpy.float32)
    b = numpy.random.rand(50000).astype(numpy.float32)
    
    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    
    mf = cl.mem_flags
    a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
    b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
    dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, b.nbytes)
    
    defines = """
        #define AXIS 0
        #define COEFF 1
        """
    
    prg = cl.Program(ctx,
        defines +
        """
        __kernel void sum(__global const float *a,
        __global const float *b, __global float *c)
        {
          int gid = get_global_id(AXIS);
          c[gid] = a[gid] + b[gid] + COEFF;
        }
        """).build()
    
    prg.sum(queue, a.shape, None, a_buf, b_buf, dest_buf)
    
    a_plus_b = numpy.empty_like(a)
    cl.enqueue_copy(queue, a_plus_b, dest_buf)
    
    print(la.norm(a_plus_b - (a+b+1)), la.norm(a_plus_b))
    
  2. (C way) Use the options keyword of Program.build to pass build options directly to clBuildProgram():

    import pyopencl as cl
    import numpy
    import numpy.linalg as la
    
    a = numpy.random.rand(50000).astype(numpy.float32)
    b = numpy.random.rand(50000).astype(numpy.float32)
    
    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    
    mf = cl.mem_flags
    a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
    b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
    dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, b.nbytes)
    
    prg = cl.Program(ctx, """
        __kernel void sum(__global const float *a,
        __global const float *b, __global float *c)
        {
          int gid = get_global_id(AXIS);
          c[gid] = a[gid] + b[gid] + COEFF;
        }
        """).build(options=['-D', 'AXIS=0', '-D', 'COEFF=1'])
    
    prg.sum(queue, a.shape, None, a_buf, b_buf, dest_buf)
    
    a_plus_b = numpy.empty_like(a)
    cl.enqueue_copy(queue, a_plus_b, dest_buf)
    
    print(la.norm(a_plus_b - (a+b+1)), la.norm(a_plus_b))
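
Applied to the struct from the question, both approaches can be driven from a plain Python dict. This is only a minimal sketch: the helper names `make_defines_header` and `make_build_options` are hypothetical (they are not part of pyopencl), and the actual `cl.Program(...)` calls are left as comments so the snippet runs without an OpenCL device.

```python
# Hypothetical helpers (not part of pyopencl) that turn a dict of
# parameters into either a "#define" header or a list of "-D" options.

def make_defines_header(params):
    """Return '#define NAME VALUE' lines to prepend to the kernel source."""
    return "".join(
        "#define {} {}\n".format(name, value) for name, value in params.items()
    )

def make_build_options(params):
    """Return ['-D', 'NAME=VALUE', ...] for Program.build(options=...)."""
    options = []
    for name, value in params.items():
        options.extend(["-D", "{}={}".format(name, value)])
    return options

# Kernel source from the question, with LIMIT left undefined.
kernel_src = """
typedef struct {
    uint i[LIMIT];
} foomatic;
"""

params = {"LIMIT": 12}

# Way 1: prepend the defines to the source string.
source_with_defines = make_defines_header(params) + kernel_src
# prg = cl.Program(ctx, source_with_defines).build()

# Way 2: leave the source untouched and pass build options.
build_options = make_build_options(params)
# prg = cl.Program(ctx, kernel_src).build(options=build_options)
```

Either variant keeps the value of LIMIT in the Python script, so the same .cl source can be built with different limits.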
    

(I used a modified version of the source code from the main page of the PyOpenCL docs. Tested on pyopencl 2013.1.)

Upvotes: 4
