Reputation: 1430
I'm trying to query the CUDA devices without adding the pycuda dependency. Here's what I've got so far:
import ctypes

cudart = ctypes.cdll.LoadLibrary('libcudart.so')
numDevices = ctypes.c_int()
cudart.cudaGetDeviceCount(ctypes.byref(numDevices))
print 'There are', numDevices.value, 'devices.'
for x in xrange(numDevices.value):
    properties = None # XXX What goes here?
    cudart.cudaGetDeviceProperties(ctypes.byref(properties), x)
    print properties
The problem is that I can't create an empty struct to pass to cudaGetDeviceProperties(). I want to do something like this:
properties = cudart.cudaDeviceProp
But that throws this error:
AttributeError: /usr/local/cuda/lib64/libcudart.so: undefined symbol: cudaDeviceProp
Here is the relevant CUDA documentation.
(edit)
Thanks to @mhawke, I got this working. For anyone else who wants to do this, I'll save you the work of typing up the class yourself:
class CudaDeviceProp(ctypes.Structure):
    _fields_ = [
        ('name', ctypes.c_char * 256),
        ('totalGlobalMem', ctypes.c_size_t),
        ('sharedMemPerBlock', ctypes.c_size_t),
        ('regsPerBlock', ctypes.c_int),
        ('warpSize', ctypes.c_int),
        ('memPitch', ctypes.c_size_t),
        ('maxThreadsPerBlock', ctypes.c_int),
        ('maxThreadsDim', ctypes.c_int * 3),
        ('maxGridSize', ctypes.c_int * 3),
        ('clockRate', ctypes.c_int),
        ('totalConstMem', ctypes.c_size_t),
        ('major', ctypes.c_int),
        ('minor', ctypes.c_int),
        ('textureAlignment', ctypes.c_size_t),
        ('texturePitchAlignment', ctypes.c_size_t),
        ('deviceOverlap', ctypes.c_int),
        ('multiProcessorCount', ctypes.c_int),
        ('kernelExecTimeoutEnabled', ctypes.c_int),
        ('integrated', ctypes.c_int),
        ('canMapHostMemory', ctypes.c_int),
        ('computeMode', ctypes.c_int),
        ('maxTexture1D', ctypes.c_int),
        ('maxTexture1DMipmap', ctypes.c_int),
        ('maxTexture1DLinear', ctypes.c_int),
        ('maxTexture2D', ctypes.c_int * 2),
        ('maxTexture2DMipmap', ctypes.c_int * 2),
        ('maxTexture2DLinear', ctypes.c_int * 3),
        ('maxTexture2DGather', ctypes.c_int * 2),
        ('maxTexture3D', ctypes.c_int * 3),
        ('maxTexture3DAlt', ctypes.c_int * 3),
        ('maxTextureCubemap', ctypes.c_int),
        ('maxTexture1DLayered', ctypes.c_int * 2),
        ('maxTexture2DLayered', ctypes.c_int * 3),
        ('maxTextureCubemapLayered', ctypes.c_int * 2),
        ('maxSurface1D', ctypes.c_int),
        ('maxSurface2D', ctypes.c_int * 2),
        ('maxSurface3D', ctypes.c_int * 3),
        ('maxSurface1DLayered', ctypes.c_int * 2),
        ('maxSurface2DLayered', ctypes.c_int * 3),
        ('maxSurfaceCubemap', ctypes.c_int),
        ('maxSurfaceCubemapLayered', ctypes.c_int * 2),
        ('surfaceAlignment', ctypes.c_size_t),
        ('concurrentKernels', ctypes.c_int),
        ('ECCEnabled', ctypes.c_int),
        ('pciBusID', ctypes.c_int),
        ('pciDeviceID', ctypes.c_int),
        ('pciDomainID', ctypes.c_int),
        ('tccDriver', ctypes.c_int),
        ('asyncEngineCount', ctypes.c_int),
        ('unifiedAddressing', ctypes.c_int),
        ('memoryClockRate', ctypes.c_int),
        ('memoryBusWidth', ctypes.c_int),
        ('l2CacheSize', ctypes.c_int),
        ('maxThreadsPerMultiProcessor', ctypes.c_int),
        ('streamPrioritiesSupported', ctypes.c_int),
        ('globalL1CacheSupported', ctypes.c_int),
        ('localL1CacheSupported', ctypes.c_int),
        ('sharedMemPerMultiprocessor', ctypes.c_size_t),
        ('regsPerMultiprocessor', ctypes.c_int),
        ('managedMemSupported', ctypes.c_int),
        ('isMultiGpuBoard', ctypes.c_int),
        ('multiGpuBoardGroupID', ctypes.c_int),
        ('singleToDoublePrecisionPerfRatio', ctypes.c_int),
        ('pageableMemoryAccess', ctypes.c_int),
        ('concurrentManagedAccess', ctypes.c_int),
    ]
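Because ctypes lays fields out in the order they appear in _fields_, a missing or mis-sized entry silently shifts every field after it. One way to sanity-check a definition like the one above is to print each field's offset and the total size, then compare them with offsetof/sizeof from the C header of the CUDA version you actually have installed (the struct has changed between releases, so the list above may not match yours). A minimal sketch of that check, using a hypothetical three-field MiniProp struct (not the real cudaDeviceProp layout) and a hypothetical helper named describe_layout:

```python
import ctypes


class MiniProp(ctypes.Structure):
    # Hypothetical struct for illustration only; NOT the real cudaDeviceProp.
    _fields_ = [
        ('name', ctypes.c_char * 256),        # char name[256]
        ('totalGlobalMem', ctypes.c_size_t),  # size_t totalGlobalMem
        ('maxThreadsDim', ctypes.c_int * 3),  # int maxThreadsDim[3]
    ]


def describe_layout(struct_type):
    """Return [(field_name, offset, size), ...] in declaration order."""
    layout = []
    for field_name, _ctype in struct_type._fields_:
        # Each declared field becomes a descriptor on the class that
        # exposes its byte offset and byte size.
        descriptor = getattr(struct_type, field_name)
        layout.append((field_name, descriptor.offset, descriptor.size))
    return layout


if __name__ == '__main__':
    for field_name, offset, size in describe_layout(MiniProp):
        print('%-16s offset=%4d size=%4d' % (field_name, offset, size))
    print('total size: %d bytes' % ctypes.sizeof(MiniProp))
```

Passing the full CudaDeviceProp class to describe_layout works the same way; if the total size is smaller than sizeof(cudaDeviceProp) in C, the runtime will write past the end of your structure.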
Upvotes: 0
Views: 885
Reputation: 87074
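cudaDeviceProp is a C struct type, not a symbol exported by the shared library, so ctypes cannot look it up on cudart. You need to define a subclass of ctypes.Structure that specifies all of the fields in a cudaDeviceProp struct. Then you can pass an instance of the structure to the function. Note that you need to declare all of the fields in the same order as in the C struct definition, because ctypes lays them out in memory in the order given. Some of them are arrays, so you need to declare those properly (for example, ctypes.c_int * 3 for an int[3] member).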
import ctypes

cudart = ctypes.cdll.LoadLibrary('libcudart.so')

class CudaDeviceProp(ctypes.Structure):
    # Abbreviated for illustration. The fields are shown alphabetically here,
    # but the real definition must list every field in the order in which
    # the C struct declares it.
    _fields_ = [('ECCEnabled', ctypes.c_int),
                ('asyncEngineCount', ctypes.c_int),
                ('canMapHostMemory', ctypes.c_int),
                ('clockRate', ctypes.c_int),
                ('computeMode', ctypes.c_int),
                ('concurrentKernels', ctypes.c_int),
                # ...
                ('totalGlobalMem', ctypes.c_size_t),
                ('unifiedAddressing', ctypes.c_int),
                ('warpSize', ctypes.c_int)]

properties = CudaDeviceProp()
cudart.cudaGetDeviceProperties(ctypes.byref(properties), 0)
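If you only need the device names, a more defensive variant is possible. The following is a rough sketch, not a definitive implementation, and it rests on a few assumptions: that name is the first member of cudaDeviceProp (as in the struct listed in the question), that 4096 bytes of padding is more than enough to cover the rest of the struct for your CUDA version, and that a return value of 0 means cudaSuccess. DevicePropPrefix and list_device_names are hypothetical names made up for this example.

```python
import ctypes


class DevicePropPrefix(ctypes.Structure):
    # Hypothetical stand-in for cudaDeviceProp: only the leading name field
    # is declared. The remaining members are covered by generous padding so
    # the runtime does not write past the end of the buffer. The 4096-byte
    # pad is an assumption, not a value taken from the CUDA headers.
    _fields_ = [
        ('name', ctypes.c_char * 256),
        ('_rest', ctypes.c_ubyte * 4096),
    ]


def list_device_names(library='libcudart.so'):
    """Return a list of device names, or None if the library cannot be loaded."""
    try:
        cudart = ctypes.CDLL(library)
    except OSError:
        return None

    count = ctypes.c_int()
    # Assumption: CUDA runtime calls return 0 (cudaSuccess) on success.
    if cudart.cudaGetDeviceCount(ctypes.byref(count)) != 0:
        return []

    names = []
    for index in range(count.value):
        props = DevicePropPrefix()
        status = cudart.cudaGetDeviceProperties(ctypes.byref(props), index)
        if status == 0:
            # A c_char array field reads back as bytes up to the first NUL.
            names.append(props.name.decode('utf-8', 'replace'))
    return names


if __name__ == '__main__':
    print(list_device_names())
```

The trade-off is that fields after name are not accessible this way; for those you still need the full structure, matched to the header of the CUDA version you are loading.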
Upvotes: 2