Reputation: 3670
My development machine has AMD, NVIDIA, and Intel platforms installed. How can I know which is the right platform to use on a user's computer? What I have now is a loop that tries to create a platform, device, context, and queue for every platform. If it fails at any point, it tries the next platform.
readKernel();
numPlatforms = getNumPlatforms(); TEST
platforms = getPlatforms(); TEST
for(int i = 0; i < numPlatforms; i++)
{
    numDevices = getNumDevices(platforms[i]); TEST_AND_CONTINUE
    devices = getDevices(platforms[i], numDevices); TEST_AND_CONTINUE
    context = createContext(platforms[i], devices); TEST_AND_CONTINUE
    queue = getCommandQueue(context, devices[0]); TEST_AND_CONTINUE
    // all setup. can post info here -> getDeviceInfo(devices[0]);
    break;
}
program = createProgram(context, source); TEST
buildProgram(program); TEST
kernel = buildKernel(program, appName); TEST
Is that a good way to do it or is there a better way?
Upvotes: 3
Views: 2715
Reputation: 2565
As usual with this kind of question, the answer is: it depends on your needs. In other words, you first have to define what "the right platform" means.
Here are some cases I can think of (I'm sure anybody can find some others):
You developed your kernel using features specific to a certain version of OpenCL. Using clGetPlatformInfo, you can query each available platform to find one that supports the required OpenCL version.
You optimized your kernel for a specific type of device (CPU, GPU). You can filter for the devices you are interested in by passing the appropriate flag (CL_DEVICE_TYPE_CPU, CL_DEVICE_TYPE_GPU, and so on) to clGetDeviceIDs.
You want to parallelize the computation as much as possible, but you have to move a lot of data to the device. In that case you might find that running your kernel on an integrated GPU (iGPU) gives the best performance. With clGetDeviceInfo and the flag CL_DEVICE_HOST_UNIFIED_MEMORY, you can determine whether such a device is available.
With clGetDeviceInfo you can also check for a specific vendor extension that you want to use (flag: CL_DEVICE_EXTENSIONS). Note that clGetPlatformInfo also provides a list of the extensions supported by the platform.
You have several GPUs available and you want the one with the "best performance". Again with clGetDeviceInfo, you can query certain specifications of the device and make your choice based on them. For instance, you can find out whether the device has a global memory cache (CL_DEVICE_GLOBAL_MEM_CACHE_TYPE) and, if so, how large it is (CL_DEVICE_GLOBAL_MEM_CACHE_SIZE) and what its cache line size is (CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE). You can also query the clock frequency (CL_DEVICE_MAX_CLOCK_FREQUENCY) or the number of compute units available on the device (CL_DEVICE_MAX_COMPUTE_UNITS).
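To make the criteria above concrete, here is a minimal sketch of the selection step, written in plain C so that it compiles without an OpenCL SDK. The device_desc struct and the functions parse_cl_version, version_at_least, device_score, and pick_device are hypothetical names invented for illustration; in a real program each field would be filled from the OpenCL query named in its comment. The score (compute units times clock frequency) is only an assumed heuristic, not a reliable measure of performance.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical record of the properties discussed above. Each field would
   be filled from the OpenCL query named in its comment. */
typedef struct {
    const char *platform_version; /* clGetPlatformInfo: CL_PLATFORM_VERSION */
    int is_gpu;                   /* clGetDeviceInfo: CL_DEVICE_TYPE has CL_DEVICE_TYPE_GPU set */
    unsigned compute_units;       /* clGetDeviceInfo: CL_DEVICE_MAX_COMPUTE_UNITS */
    unsigned clock_mhz;           /* clGetDeviceInfo: CL_DEVICE_MAX_CLOCK_FREQUENCY */
} device_desc;

/* The version string has the form "OpenCL <major>.<minor> <platform text>".
   Returns 1 if both numbers were parsed, 0 otherwise. */
int parse_cl_version(const char *s, int *major, int *minor)
{
    return s != NULL && sscanf(s, "OpenCL %d.%d", major, minor) == 2;
}

/* Returns 1 if the version string reports at least req_major.req_minor. */
int version_at_least(const char *s, int req_major, int req_minor)
{
    int major, minor;
    if (!parse_cl_version(s, &major, &minor))
        return 0;
    return major > req_major || (major == req_major && minor >= req_minor);
}

/* Assumed heuristic: more compute units and a higher clock score higher. */
unsigned long device_score(const device_desc *d)
{
    return (unsigned long)d->compute_units * d->clock_mhz;
}

/* Returns the index of the highest-scoring GPU whose platform meets the
   required OpenCL version, or -1 if no device qualifies. */
int pick_device(const device_desc *devs, size_t count,
                int req_major, int req_minor)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (!devs[i].is_gpu)
            continue;
        if (!version_at_least(devs[i].platform_version, req_major, req_minor))
            continue;
        if (best < 0 || device_score(&devs[i]) > device_score(&devs[best]))
            best = (int)i;
    }
    return best;
}
```

A return value of -1 is the signal to fall back to another strategy, for example relaxing the version requirement or accepting a CPU device. Which properties go into the score depends entirely on the kernel, so benchmarking on the candidate devices is a safer guide than any fixed formula.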
Upvotes: 8
Reputation: 8410
Typically a good common use case is to:
You can refine points 3 and 4 to select only the best GPU device for your needs with clGetDeviceInfo().
Upvotes: 4