Luke Yeager

Reputation: 1430

Passing a C struct to a function using ctypes

I'm trying to query the CUDA devices without adding the pycuda dependency. Here's what I've got so far:

import ctypes

cudart = ctypes.cdll.LoadLibrary('libcudart.so')

numDevices = ctypes.c_int()
cudart.cudaGetDeviceCount(ctypes.byref(numDevices))
print 'There are', numDevices.value, 'devices.'

for x in xrange(numDevices.value):
    properties = None # XXX What goes here?
    cudart.cudaGetDeviceProperties(ctypes.byref(properties), x)
    print properties

The problem is that I can't create an empty struct to pass to cudaGetDeviceProperties(). I want to do something like this:

properties = cudart.cudaDeviceProp

But that throws this error:

AttributeError: /usr/local/cuda/lib64/libcudart.so: undefined symbol: cudaDeviceProp

Here is the relevant CUDA documentation.
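
As a rough illustration of why that lookup fails: attribute access on a library loaded through ctypes resolves exported symbols (functions and global variables), whereas a struct type such as cudaDeviceProp exists only in the header at compile time, so there is nothing in the shared object to find. The sketch below shows the same behaviour with the C library instead of CUDA. It assumes a POSIX system, where ctypes.CDLL(None) opens the current process; strlen and timeval are just stand-ins chosen for the example.

```python
import ctypes

# Assumption: POSIX only. CDLL(None) gives access to the symbols already
# loaded into this process, which includes the C library.
libc = ctypes.CDLL(None)

# strlen is an exported function, so the symbol lookup succeeds.
has_function = hasattr(libc, 'strlen')

# struct timeval is a type declared in a header; no symbol with that
# name is exported, so the lookup raises AttributeError and hasattr
# reports False.
has_struct = hasattr(libc, 'timeval')

print(has_function)
print(has_struct)
```

The struct therefore has to be described on the Python side, which is what the answer below does.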

(edit)

Thanks to @mhawke, I got this working. For anyone else who wants to do this, I'll save you the work of typing up the class yourself:

class CudaDeviceProp(ctypes.Structure):
    _fields_ = [ 
            ('name', ctypes.c_char * 256),
            ('totalGlobalMem', ctypes.c_size_t),
            ('sharedMemPerBlock', ctypes.c_size_t),
            ('regsPerBlock', ctypes.c_int),
            ('warpSize', ctypes.c_int),
            ('memPitch', ctypes.c_size_t),
            ('maxThreadsPerBlock', ctypes.c_int),
            ('maxThreadsDim', ctypes.c_int * 3), 
            ('maxGridSize', ctypes.c_int * 3), 
            ('clockRate', ctypes.c_int),
            ('totalConstMem', ctypes.c_size_t),
            ('major', ctypes.c_int),
            ('minor', ctypes.c_int),
            ('textureAlignment', ctypes.c_size_t),
            ('texturePitchAlignment', ctypes.c_size_t),
            ('deviceOverlap', ctypes.c_int),
            ('multiProcessorCount', ctypes.c_int),
            ('kernelExecTimeoutEnabled', ctypes.c_int),
            ('integrated', ctypes.c_int),
            ('canMapHostMemory', ctypes.c_int),
            ('computeMode', ctypes.c_int),
            ('maxTexture1D', ctypes.c_int),
            ('maxTexture1DMipmap', ctypes.c_int),
            ('maxTexture1DLinear', ctypes.c_int),
            ('maxTexture2D', ctypes.c_int * 2), 
            ('maxTexture2DMipmap', ctypes.c_int * 2), 
            ('maxTexture2DLinear', ctypes.c_int * 3), 
            ('maxTexture2DGather', ctypes.c_int * 2), 
            ('maxTexture3D', ctypes.c_int * 3), 
            ('maxTexture3DAlt', ctypes.c_int * 3), 
            ('maxTextureCubemap', ctypes.c_int),
            ('maxTexture1DLayered', ctypes.c_int * 2), 
            ('maxTexture2DLayered', ctypes.c_int * 3), 
            ('maxTextureCubemapLayered', ctypes.c_int * 2), 
            ('maxSurface1D', ctypes.c_int),
            ('maxSurface2D', ctypes.c_int * 2), 
            ('maxSurface3D', ctypes.c_int * 3), 
            ('maxSurface1DLayered', ctypes.c_int * 2), 
            ('maxSurface2DLayered', ctypes.c_int * 3), 
            ('maxSurfaceCubemap', ctypes.c_int),
            ('maxSurfaceCubemapLayered', ctypes.c_int * 2), 
            ('surfaceAlignment', ctypes.c_size_t),
            ('concurrentKernels', ctypes.c_int),
            ('ECCEnabled', ctypes.c_int),
            ('pciBusID', ctypes.c_int),
            ('pciDeviceID', ctypes.c_int),
            ('pciDomainID', ctypes.c_int),
            ('tccDriver', ctypes.c_int),
            ('asyncEngineCount', ctypes.c_int),
            ('unifiedAddressing', ctypes.c_int),
            ('memoryClockRate', ctypes.c_int),
            ('memoryBusWidth', ctypes.c_int),
            ('l2CacheSize', ctypes.c_int),
            ('maxThreadsPerMultiProcessor', ctypes.c_int),
            ('streamPrioritiesSupported', ctypes.c_int),
            ('globalL1CacheSupported', ctypes.c_int),
            ('localL1CacheSupported', ctypes.c_int),
            ('sharedMemPerMultiprocessor', ctypes.c_size_t),
            ('regsPerMultiprocessor', ctypes.c_int),
            ('managedMemSupported', ctypes.c_int),
            ('isMultiGpuBoard', ctypes.c_int),
            ('multiGpuBoardGroupID', ctypes.c_int),
            ('singleToDoublePrecisionPerfRatio', ctypes.c_int),
            ('pageableMemoryAccess', ctypes.c_int),
            ('concurrentManagedAccess', ctypes.c_int),
            ]
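
One caveat with a hand-typed definition like this is that it matches one CUDA release: cudaDeviceProp has gained members over time, and the runtime writes its own full struct into whatever buffer it is given. A minimal sketch of a more defensive call follows, assuming only that name remains the first member. DevicePropPrefix, the 4096-byte opaque tail, and list_device_names are hypothetical names and sizes picked for illustration; they are not part of CUDA or ctypes.

```python
import ctypes

class DevicePropPrefix(ctypes.Structure):
    # Only the leading member is named. The tail is an assumption: a
    # buffer meant to be comfortably larger than the real struct, so
    # the runtime does not write past the end of the allocation.
    _fields_ = [
        ('name', ctypes.c_char * 256),
        ('_opaque_tail', ctypes.c_ubyte * 4096),
    ]

def list_device_names(library='libcudart.so'):
    """Return device names, or None if the library cannot be loaded."""
    try:
        cudart = ctypes.CDLL(library)
    except OSError:
        return None

    # Declaring argument and return types lets ctypes check each call.
    get_count = cudart.cudaGetDeviceCount
    get_count.argtypes = [ctypes.POINTER(ctypes.c_int)]
    get_count.restype = ctypes.c_int

    get_props = cudart.cudaGetDeviceProperties
    get_props.argtypes = [ctypes.POINTER(DevicePropPrefix), ctypes.c_int]
    get_props.restype = ctypes.c_int

    count = ctypes.c_int()
    if get_count(ctypes.byref(count)) != 0:  # 0 is cudaSuccess
        return []

    names = []
    for index in range(count.value):
        props = DevicePropPrefix()
        if get_props(ctypes.byref(props), index) == 0:
            names.append(props.name.decode('utf-8', 'replace'))
    return names
```

Calling list_device_names() returns None on a machine without the CUDA runtime, which keeps the query optional in the same spirit as avoiding the pycuda dependency.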

Upvotes: 0

Views: 885

Answers (1)

mhawke

Reputation: 87074

You need to define a subclass of ctypes.Structure that specifies all of the fields in the cudaDeviceProp struct. Then you can pass an instance of the structure to the function by reference. Note that you need to list every field in the same order as the C declaration, because ctypes derives each field's offset from its position in _fields_. Some of the fields are arrays, so declare those with the multiplication syntax, e.g. ctypes.c_int * 3.

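import ctypes

cudart = ctypes.cdll.LoadLibrary('libcudart.so')

class CudaDeviceProp(ctypes.Structure):
    # Abbreviated for illustration. The real list must name every member
    # of cudaDeviceProp in the order the CUDA header declares them (not
    # alphabetically, as in this shortened listing).
    _fields_ = [('ECCEnabled', ctypes.c_int),
                ('asyncEngineCount', ctypes.c_int),
                ('canMapHostMemory', ctypes.c_int),
                ('clockRate', ctypes.c_int),
                ('computeMode', ctypes.c_int),
                ('concurrentKernels', ctypes.c_int),
                ...
                ('totalGlobalMem', ctypes.c_size_t),
                ('unifiedAddressing', ctypes.c_int),
                ('warpSize', ctypes.c_int)]

# Create an empty instance and let the runtime fill it in for device 0.
properties = CudaDeviceProp()
cudart.cudaGetDeviceProperties(ctypes.byref(properties), 0)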
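
To see why the field order matters, here is a small sketch that runs without CUDA. Ordered and Shuffled are hypothetical three-field structures borrowed from the start of cudaDeviceProp, and the raw bytes stand in for what a C function would write into the caller's buffer. Reading those bytes through the wrongly ordered definition gives different values because every offset has moved.

```python
import ctypes

class Ordered(ctypes.Structure):
    # Same order as the (hypothetical) C declaration.
    _fields_ = [('name', ctypes.c_char * 256),
                ('totalGlobalMem', ctypes.c_size_t),
                ('maxThreadsDim', ctypes.c_int * 3)]

class Shuffled(ctypes.Structure):
    # Same fields in a different order: every offset changes.
    _fields_ = [('maxThreadsDim', ctypes.c_int * 3),
                ('name', ctypes.c_char * 256),
                ('totalGlobalMem', ctypes.c_size_t)]

# Stand-in for the bytes a C function would write into the buffer.
source = Ordered()
source.name = b'Fake GPU'
source.totalGlobalMem = 1024
source.maxThreadsDim[0] = 7
raw = ctypes.string_at(ctypes.addressof(source), ctypes.sizeof(source))
# Pad defensively so the buffer is large enough for either layout.
raw = raw.ljust(ctypes.sizeof(Shuffled), b'\0')

good = Ordered.from_buffer_copy(raw)   # matching layout
bad = Shuffled.from_buffer_copy(raw)   # mismatched layout

print(Ordered.name.offset)
print(Shuffled.name.offset)
print(good.name)
print(bad.name)
```

With the matching layout, good.name reads back as the original string; with the shuffled layout, name is read from a different offset and no longer matches.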

Upvotes: 2
