Reputation: 61
A single GPU such as the P100 has 56 SMs (streaming multiprocessors), and different SMs may have little correlation with one another. I would like to know how application performance varies with different SMs. Is there any way to disable some SMs on a given GPU? I know CPUs offer corresponding mechanisms, but I haven't found a good one for GPUs yet. Thanks!
Upvotes: 0
Views: 350
Reputation: 151849
There are no CUDA-provided methods to disable an SM (streaming multiprocessor). Some possibilities exist to approximate this using indirect methods, with varying degrees of difficulty and differing behavior:
1. Use CUDA MPS, and launch an application that fully "occupies" one or more SMs by carefully controlling the number of blocks launched and the resource utilization of those blocks. With CUDA MPS, another application can run on the same GPU, and the kernels can run concurrently, assuming sufficient care is taken. This might require no direct modification of the application code under test (but an additional application launch is needed, as well as MPS). The kernel duration will need to be "long" so as to occupy the SMs while the application under test is running.
2. In your application code, effectively re-create the behavior listed in item 1 above by launching the "dummy" kernel from the same application as the code under test, and have the dummy kernel "occupy" one or more SMs. The application under test can then launch the desired kernel. This should allow for kernel concurrency without MPS.
3. In your application code, for the kernel under test itself, modify the kernel block scheduling behavior, probably using the %smid special register via inline PTX, to cause the application kernel itself to only utilize certain SMs, effectively reducing the total number in use.
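As a rough illustration of the last approach, here is a minimal sketch, not a definitive implementation. It assumes a simple element-wise workload; the names current_smid, sm_allowed, restricted_kernel, sm_limit and next_tile are hypothetical and chosen only for this example. One thread per block reads %smid, blocks that land on a disallowed SM exit immediately, and the remaining blocks claim tiles of work from a global atomic counter, so the result does not depend on which blocks survive. It also assumes that a grid large enough to fill the GPU places at least one block on an allowed SM, which CUDA does not guarantee, so the host code verifies the output. Note that SM ID numbering is not guaranteed to be contiguous.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Read the ID of the SM the calling thread is currently running on.
__device__ __forceinline__ unsigned int current_smid() {
    unsigned int id;
    asm volatile("mov.u32 %0, %%smid;" : "=r"(id));
    return id;
}

// Hypothetical policy for this sketch: only SMs with ID < sm_limit do work.
__host__ __device__ inline bool sm_allowed(unsigned int smid, unsigned int sm_limit) {
    return smid < sm_limit;
}

__global__ void restricted_kernel(float *data, unsigned int n, unsigned int tile,
                                  unsigned int sm_limit, unsigned int *next_tile) {
    __shared__ bool active;           // block-wide decision, made by thread 0
    __shared__ unsigned int claimed;  // tile index claimed for this iteration

    if (threadIdx.x == 0) active = sm_allowed(current_smid(), sm_limit);
    __syncthreads();
    if (!active) return;  // whole block exits: this SM is "disabled"

    while (true) {
        if (threadIdx.x == 0) claimed = atomicAdd(next_tile, 1u);
        __syncthreads();
        unsigned int base = claimed * tile;
        __syncthreads();  // everyone has read 'claimed' before it is rewritten
        if (base >= n) return;  // no work left
        unsigned int end = min(base + tile, n);
        for (unsigned int i = base + threadIdx.x; i < end; i += blockDim.x)
            data[i] = 2.0f * data[i] + 1.0f;  // placeholder workload
    }
}

int main() {
    const unsigned int n = 1u << 20, tile = 1024, sm_limit = 4;
    const int threads = 256;

    float *data = nullptr;
    unsigned int *next_tile = nullptr;
    cudaMallocManaged(&data, n * sizeof(float));
    cudaMallocManaged(&next_tile, sizeof(unsigned int));
    for (unsigned int i = 0; i < n; ++i) data[i] = 1.0f;
    *next_tile = 0;

    // Size the grid so that it fills every SM once.
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    int blocks_per_sm = 0;
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocks_per_sm, restricted_kernel,
                                                  threads, 0);
    assert(blocks_per_sm > 0);
    int grid = prop.multiProcessorCount * blocks_per_sm;

    restricted_kernel<<<grid, threads>>>(data, n, tile, sm_limit, next_tile);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Verify that the surviving blocks processed every element.
    unsigned int bad = 0;
    for (unsigned int i = 0; i < n; ++i) bad += (data[i] != 3.0f);
    printf("SM limit %u: %s\n", sm_limit, bad == 0 ? "all elements processed" : "incomplete");

    cudaFree(data);
    cudaFree(next_tile);
    return bad == 0 ? 0 : 1;
}
```

Timing the kernel for different values of sm_limit would give a rough picture of how performance scales with the number of SMs in use. The atomic work queue adds some overhead of its own, so comparisons are best made against the same kernel with sm_limit set to the full SM count rather than against the unmodified original.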
Upvotes: 4