Reputation: 5073
In my application I list all available OpenCL devices so that the user can select the devices on which to run the computation. The results I am getting on my laptop have left me befuddled.
The following excerpt is the code that produced these results:
//CL_DEVICE_TYPE
{
cl_device_type devtype;
QString temp = "Unknown";
err = clGetDeviceInfo(devices[i][j], CL_DEVICE_TYPE, sizeof(devtype), &devtype, NULL);
if(err == CL_SUCCESS)
{
if(devtype == CL_DEVICE_TYPE_CPU)
temp = "CPU";
else if(devtype == CL_DEVICE_TYPE_GPU)
temp = "GPU";
else if(devtype == CL_DEVICE_TYPE_ACCELERATOR)
temp = "Accelerator";
else
temp = "Unknown";
}
ilist->append(temp);
}
//CL_DEVICE_MAX_CLOCK_FREQUENCY
{
cl_uint devfreq;
err = clGetDeviceInfo(devices[i][j], CL_DEVICE_MAX_CLOCK_FREQUENCY, sizeof(devfreq), &devfreq, NULL);
if(err == CL_SUCCESS)
ilist->append(QString::number((unsigned int)devfreq));
else
ilist->append("Unknown");
}
//CL_DEVICE_GLOBAL_MEM_SIZE
{
cl_ulong devmem;
err = clGetDeviceInfo(devices[i][j], CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(devmem), &devmem, NULL);
if(err == CL_SUCCESS)
ilist->append(QString::number((unsigned int)(devmem / 1000000))); // bytes -> MB, converted only after the query succeeded
else
ilist->append("Unknown");
}
//CL_DEVICE_MAX_COMPUTE_UNITS * CL_DEVICE_MAX_WORK_GROUP_SIZE
{
cl_uint devcores;
err = clGetDeviceInfo(devices[i][j], CL_DEVICE_MAX_COMPUTE_UNITS, sizeof(devcores), &devcores, NULL);
if(err == CL_SUCCESS)
{
size_t devcores2;
err = clGetDeviceInfo(devices[i][j], CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(devcores2), &devcores2, NULL);
if(err == CL_SUCCESS)
ilist->append(QString::number(((unsigned int)(devcores)) * ((unsigned int)(devcores2))));
else
ilist->append("Unknown");
}
else
ilist->append("Unknown");
}
What I do not understand is the memory size and the number of parallel computations shown for the CPU. Any idea why I am getting these results?
Upvotes: 1
Views: 159
Reputation: 1814
Measuring device performance is a complicated task.
The metrics you used are not suitable for determining how fast a device is. Moreover, simple tasks like matrix multiplication don't show it either. You need to use benchmarks to determine computing capabilities.
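As a rough illustration of the benchmarking idea, here is a minimal timing sketch in standard C++. The helper name `best_time_seconds` is hypothetical, and the sketch assumes that `work` blocks until the computation has finished (for an OpenCL kernel, that would mean the callable enqueues the kernel and then waits, for example with `clFinish`). It is not a full benchmark, only a starting point.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <limits>

// Hypothetical helper (not part of OpenCL or Qt): runs `work` `runs` times
// and returns the fastest wall-clock time in seconds.
// Assumption: `work` returns only after the computation has completed.
// If `runs` is zero or negative, the result is +infinity.
double best_time_seconds(const std::function<void()>& work, int runs)
{
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < runs; ++r) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        const std::chrono::duration<double> elapsed = stop - start;
        best = std::min(best, elapsed.count());
    }
    return best;
}
```

Taking the best of several runs reduces the influence of one-off costs such as kernel compilation or a cold cache. Running the same representative workload on each device and comparing the times says more than any of the static device properties.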
Upvotes: 1
Reputation: 6333
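It is because CL_DEVICE_MAX_WORK_GROUP_SIZE is not an indicator of parallel computation ability. It is the maximum number of work-items allowed in a single work-group, not the number of work-items the device executes at the same time, so multiplying it by CL_DEVICE_MAX_COMPUTE_UNITS does not give a core count.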
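One possible way to avoid the misleading product is to show the two values side by side and leave them unmultiplied. The sketch below is only an assumption about how the display string could be built. The function name `describe_parallelism` is hypothetical, and it takes plain integers, so the values would still come from the two `clGetDeviceInfo` calls in the question.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: formats the two queried values separately rather
// than multiplying them into a single "core count".
// compute_units       - value of CL_DEVICE_MAX_COMPUTE_UNITS
// max_work_group_size - value of CL_DEVICE_MAX_WORK_GROUP_SIZE
std::string describe_parallelism(unsigned int compute_units,
                                 std::size_t max_work_group_size)
{
    return std::to_string(compute_units) + " compute units, max work-group size "
         + std::to_string(max_work_group_size);
}
```

In the question's code the result could be passed to `QString::fromStdString` before appending it to the list.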
Upvotes: 2