Reputation: 1145
I am debugging a project and have not yet figured out which part is wrong, but I suspect a race condition occurs during stream creation.
Consider the following code:
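#pragma omp parallel num_threads(4)
{
    // Thread i selects device i as its current device.
    int threadId = omp_get_thread_num();
    cudaSetDevice(threadId);
    // Each thread then creates its own 20 streams on that device.
    cudaStream_t streams[20];
    for (int i = 0; i < 20; ++i) cudaStreamCreate(streams + i);
}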
Could this cause a race condition, e.g. different threads creating streams with the same stream ID but on different devices?
Upvotes: 0
Views: 228
Reputation: 185
Have you tried creating the CUDA streams serially? You can create the streams first in a serial section, and then parallelize only the code that needs it.
Upvotes: 1