Reputation: 31
There are four Tesla C2075 cards in my server, so I tested the simpleMultiGPU sample that came with the SDK. Unexpectedly, I got a segmentation fault. With cuda-gdb I found that the fault occurs when cudaStreamCreate is called the second time. Here is the deviceQuery output:
$ deviceQuery
CUDA Driver = CUDART,
CUDA Driver Version = 4.2,
CUDA Runtime Version = 4.2,
NumDevs = 4,
Device = Tesla C2075,
Device = Tesla C2075
The driver version looks OK, so why does the second cudaStreamCreate call fail? Can anybody help me?
Upvotes: 1
Views: 535
Reputation: 2060
I'd start by running nvidia-healthmon, which can be downloaded from https://developer.nvidia.com/tesla-deployment-kit (it's part of the TDK).
The output of nvidia-bug-report.sh is always very helpful too. The log from nvidia-healthmon (--log-file flag) might also give us some clue.
Are there any other applications that are failing? It'd be good to rule out other possibilities by running other apps from the SDK, like vectorAdd or matrixMul.
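If the other samples run fine, it may also help to isolate the stream creation from the rest of simpleMultiGPU. Below is a minimal sketch, not taken from the SDK: the file name stream_check.cu and the build command are assumptions, and it only uses runtime calls that should be available in CUDA 4.2. It selects each device in turn, creates one stream on it, and prints the status before and after every call, so the last line printed shows which device and which call was active if the process crashes.

```cuda
// stream_check.cu: hypothetical stand-alone check, not part of the CUDA SDK.
// Assumed build command: nvcc stream_check.cu -o stream_check
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("found %d device(s)\n", count);

    int failures = 0;
    for (int dev = 0; dev < count; ++dev) {
        // Flush before each call so the message survives a segmentation fault.
        printf("device %d: calling cudaSetDevice\n", dev);
        fflush(stdout);
        err = cudaSetDevice(dev);
        if (err != cudaSuccess) {
            fprintf(stderr, "device %d: cudaSetDevice: %s\n",
                    dev, cudaGetErrorString(err));
            ++failures;
            continue;
        }

        printf("device %d: calling cudaStreamCreate\n", dev);
        fflush(stdout);
        cudaStream_t stream;
        err = cudaStreamCreate(&stream);
        if (err != cudaSuccess) {
            fprintf(stderr, "device %d: cudaStreamCreate: %s\n",
                    dev, cudaGetErrorString(err));
            ++failures;
            continue;
        }

        printf("device %d: stream created\n", dev);
        cudaStreamDestroy(stream);
    }

    // A non-zero exit code means at least one device reported an error.
    return failures ? 1 : 0;
}
```

If this also crashes on the second device, the problem is probably in the driver or hardware setup rather than in the sample, and the nvidia-bug-report.sh output would be the next thing to look at.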
Upvotes: 1