Jaysmito Mukherjee

Reputation: 1526

OpenCL kernel works fine on Intel but not on an NVIDIA GPU?

My application uses OpenCL. It works fine on my machine, but on a user's machine the kernels do not run at all.

My Machine:

OpenCL Status : Using OpenCL Platform : Intel(R) OpenCL HD Graphics
OpenCL Status : Using GPU Device : Intel(R) HD Graphics 510

User's Machine:

OpenCL Status : Using OpenCL Platform : NVIDIA CUDA
OpenCL Status : Using GPU Device : GeForce GT 730

Code (C++):

kernels = new ComputeKernel();
std::string source = ReadShaderSourceFile(GetExecutableDir() + "\\Data\\kernels\\generators\\generators.cl", &tmp);
kernels->AddSoruce(source);
kernels->BuildProgram("-I" + appState->globals.kernelsIncludeDir + " -cl-fast-relaxed-math -cl-mad-enable");
kernels->AddKernel("clear_mesh_terrain");
kernels->CreateBuffer("mesh", CL_MEM_READ_WRITE, appState->models.customBase->mesh->vertexCount * sizeof(Vert));
kernels->WriteBuffer("mesh", true, appState->models.customBase->mesh->vertexCount * sizeof(Vert), appState->models.customBase->mesh->vert);
kernels->SetKernelArg("clear_mesh_terrain", 0, "mesh");

kernels->ExecuteKernel("clear_mesh_terrain", cl::NDRange(1), cl::NDRange(appState->models.coreTerrain->mesh->vertexCount));

The ComputeKernel class:

void ComputeKernel::AddSoruce(std::string source)
{
    sources.push_back({source.c_str(), source.size()});
}

void ComputeKernel::BuildProgram(std::string options)
{
    program = cl::Program(context, sources);

    if (program.build({ device }, options.c_str()) != CL_SUCCESS)
    {
        onStatus("Error Building : " + program.getBuildInfo<CL_PROGRAM_BUILD_LOG>(device));
        return;
    }
}

void ComputeKernel::AddKernel(std::string name)
{
    kernels[name] = cl::Kernel(program, name.c_str());
}

void ComputeKernel::Clear()
{
    sources.clear();
    kernels.clear();
}

void ComputeKernel::ExecuteKernel(std::string name, cl::NDRange local, cl::NDRange global)
{
    queue.enqueueNDRangeKernel(kernels[name], cl::NullRange, global, local);
    queue.finish();
}

void ComputeKernel::CreateBuffer(std::string name, int type, size_t size)
{
    OpenCLBuffer buffer;
    buffer.size = size;
    buffer.buffer = cl::Buffer(context, type, size);
    buffers[name] = buffer;
}

void ComputeKernel::SetKernelArg(std::string name, int arg, std::string buffer)
{
    kernels[name].setArg(arg, buffers[buffer].buffer);
}

void ComputeKernel::ReadBuffer(std::string buffer, bool blocking, size_t size, void* data)
{
    queue.enqueueReadBuffer(buffers[buffer].buffer, blocking ? CL_TRUE : CL_FALSE, 0, size, data);
}

void ComputeKernel::WriteBuffer(std::string buffer, bool blocking, size_t size, void* data)
{
    queue.enqueueWriteBuffer(buffers[buffer].buffer, blocking ? CL_TRUE : CL_FALSE, 0, size, data);
}

The Kernel:

__kernel void clear_mesh_terrain(__global Vert* mesh)
{
    int i = get_global_id(0);
    mesh[i].normal.x = 0.0f;
    mesh[i].normal.y = 0.0f;
    mesh[i].normal.z = 0.0f;
    mesh[i].normal.w = 0.0f;
    mesh[i].position.y = 0.0f;

}

By "not working" I mean that nothing happens at all (the kernels are never executed), yet there are no errors and no kernel compile failures.

Upvotes: 0

Views: 244

Answers (1)

ProjectPhysX

Reputation: 5736

The most likely explanation is that you are running out of memory on your NVIDIA GT 730. Some of the buffers fail to allocate, so the kernel runs through with 0 execution time and does nothing.
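One way to confirm this is to stop discarding status codes. In the OpenCL C++ bindings, `enqueueNDRangeKernel`, `enqueueWriteBuffer` and `enqueueReadBuffer` return a `cl_int`, and the `cl::Buffer` constructor takes an optional `cl_int*` as its last argument, but the wrapper in the question ignores all of them. The sketch below is only an illustration: `DescribeStatus` and `CheckStatus` are hypothetical helpers, and the constants are local copies of the values in `CL/cl.h` so that the snippet compiles without the OpenCL headers.

```cpp
#include <string>

// Local copies of status values from CL/cl.h (assumption: in real code,
// include the OpenCL header and use the CL_* macros directly).
constexpr int kSuccess                    = 0;    // CL_SUCCESS
constexpr int kMemObjectAllocationFailure = -4;   // CL_MEM_OBJECT_ALLOCATION_FAILURE
constexpr int kOutOfResources             = -5;   // CL_OUT_OF_RESOURCES
constexpr int kOutOfHostMemory            = -6;   // CL_OUT_OF_HOST_MEMORY
constexpr int kInvalidWorkGroupSize       = -54;  // CL_INVALID_WORK_GROUP_SIZE
constexpr int kInvalidBufferSize          = -61;  // CL_INVALID_BUFFER_SIZE

// Hypothetical helper: maps a status code to a readable name.
std::string DescribeStatus(int status)
{
    switch (status)
    {
        case kSuccess:                    return "CL_SUCCESS";
        case kMemObjectAllocationFailure: return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
        case kOutOfResources:             return "CL_OUT_OF_RESOURCES";
        case kOutOfHostMemory:            return "CL_OUT_OF_HOST_MEMORY";
        case kInvalidWorkGroupSize:       return "CL_INVALID_WORK_GROUP_SIZE";
        case kInvalidBufferSize:          return "CL_INVALID_BUFFER_SIZE";
        default:                          return "OpenCL error " + std::to_string(status);
    }
}

// Hypothetical helper: returns true on success; on failure it writes a
// message naming the call that failed into *message and returns false.
bool CheckStatus(int status, const std::string& what, std::string* message)
{
    if (status == kSuccess)
    {
        return true;
    }
    *message = what + " failed: " + DescribeStatus(status);
    return false;
}
```

Inside `ComputeKernel::ExecuteKernel` this could be used roughly as `CheckStatus(queue.enqueueNDRangeKernel(...), "enqueueNDRangeKernel", &msg)`, passing `msg` on to `onStatus` when the check fails.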

Keep track of how much memory you allocate in total. It should still work with smaller buffers.
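A minimal sketch of that bookkeeping, assuming you wrap every buffer creation in a check like this. `BufferBudget` is a hypothetical class, not part of OpenCL or of the question's code. In a real application the two limits would presumably come from `device.getInfo<CL_DEVICE_GLOBAL_MEM_SIZE>()` and `device.getInfo<CL_DEVICE_MAX_MEM_ALLOC_SIZE>()`; here they are plain constructor arguments so the snippet compiles without the OpenCL headers.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical bookkeeping class: tracks the total size of all named buffers
// against a device-wide limit and a per-allocation limit.
class BufferBudget
{
public:
    BufferBudget(std::size_t totalBytes, std::size_t maxSingleBytes)
        : total_(totalBytes), maxSingle_(maxSingleBytes) {}

    // Records the buffer and returns true if it fits; returns false and
    // changes nothing if it is empty, too large on its own, or would push
    // the running total over the limit.
    bool Reserve(const std::string& name, std::size_t bytes)
    {
        if (bytes == 0 || bytes > maxSingle_)
        {
            return false;
        }
        // Re-creating a buffer under an existing name replaces its old size.
        std::size_t previous = 0;
        auto it = sizes_.find(name);
        if (it != sizes_.end())
        {
            previous = it->second;
        }
        const std::size_t usedWithoutPrevious = used_ - previous;
        if (bytes > total_ - usedWithoutPrevious)
        {
            return false;
        }
        sizes_[name] = bytes;
        used_ = usedWithoutPrevious + bytes;
        return true;
    }

    // Forgets a buffer and gives its bytes back to the budget.
    void Release(const std::string& name)
    {
        auto it = sizes_.find(name);
        if (it != sizes_.end())
        {
            used_ -= it->second;
            sizes_.erase(it);
        }
    }

    std::size_t Used() const { return used_; }

private:
    std::size_t total_;
    std::size_t maxSingle_;
    std::size_t used_ = 0;
    std::map<std::string, std::size_t> sizes_;
};
```

Calling `Reserve("mesh", vertexCount * sizeof(Vert))` before `CreateBuffer` and logging a `false` result would show whether the allocation can fit at all. Treat the total as an upper bound, since the display and other applications may also be using memory on the same GPU.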

Upvotes: 1
