Reputation: 21
I want to use the thread id to index an array that is defined as a global variable, but I run into a problem when incrementing its elements by one. Take a look below:
// initial array myU[0..3]={0,0,0,0}, myindex[0..3]={0,1,1,3}
1- tid=0,1,2,3 //tid is the thread index
2- id=myindex[tid]; //id=0,1,1,3
3- myU[id]=myU[id]+1;
4- if (myU[id]>1)
//print("id"); // it should print '1'
I expected that after line 3 runs I would have myU[0]=1, myU[1]=2, myU[3]=1. But the myU array contains strange values, such as myU[0]=0, myU[1]=1, myU[3]=3. I don't know why.
My final goal is to find the ids (in line 4) whose entries were incremented more than once.
Upvotes: 0
Views: 671
Reputation: 4411
If myU[1] is written by 2 different threads, then the result is undefined, and you need to use atomicAdd to obtain myU[1]==2.
The CUDA programming guide states:
If a non-atomic instruction executed by a warp writes to the same location in global or shared memory for more than one of the threads of the warp, the number of serialized writes that occur to that location varies depending on the compute capability of the device and which thread performs the final write is undefined.
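A minimal sketch of the atomicAdd approach, assuming the arrays from the question live in `__device__` global memory and the kernel is launched with a single block of 4 threads. The kernel name `countIds` and the host buffer `hostU` are made up for illustration, and error checking is omitted for brevity. It also relies on the fact that atomicAdd returns the value stored before the increment, which gives one way to report each duplicated id exactly once:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Global-memory arrays mirroring the question:
// myU = {0,0,0,0}, myindex = {0,1,1,3}.
__device__ int myU[4]     = {0, 0, 0, 0};
__device__ int myindex[4] = {0, 1, 1, 3};

// Hypothetical kernel name; one thread per entry of myindex.
__global__ void countIds()
{
    int tid = threadIdx.x;   // tid = 0,1,2,3
    int id  = myindex[tid];  // id  = 0,1,1,3

    // atomicAdd performs the read-modify-write as one indivisible
    // operation and returns the value stored before this increment.
    int old = atomicAdd(&myU[id], 1);

    // For an id hit by several threads, exactly one of them observes
    // old == 1, so each duplicated id is reported once.
    if (old == 1)
        printf("id %d was incremented more than once\n", id);
}

int main()
{
    countIds<<<1, 4>>>();
    cudaDeviceSynchronize();  // wait for the kernel and flush device printf

    // Copy the counters back to the host to inspect them.
    int hostU[4];
    cudaMemcpyFromSymbol(hostU, myU, sizeof(hostU));
    for (int i = 0; i < 4; ++i)
        printf("myU[%d] = %d\n", i, hostU[i]);
    return 0;
}
```

With the inputs above, the counters end up as myU[0]=1, myU[1]=2, myU[2]=0, myU[3]=1, and only id 1 is reported. Testing `myU[id] > 1` with a plain read after the increment, as in line 4 of the question, would still race with other threads' increments, which is why the sketch checks the value returned by atomicAdd instead.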
Upvotes: 4