Reza

Reputation: 21

CUDA: using a thread's index to access array elements more than once

I want to use the thread index to access an array that is defined as a global variable, but incrementing its elements by one gives wrong results. Take a look below:

// initial array myU[0..3]={0,0,0,0}, myindex[0..3]={0,1,1,3}
1- tid=0,1,2,3 //tid is threads index
2- id=myindex[tid]; //id=0,1,1,3
3- myU[id]=myU[id]+1; 
4- if (myU[id]>1)
     //print(id); // it should print '1'

I expected that after running line 3 I would have myU[0]=1, myU[1]=2, myU[3]=1. Instead, the myU array holds strange values such as myU[0]=0, myU[1]=1, myU[3]=3. I don't know why.

My final goal is to find the ids (in line 4) whose elements were incremented by one more than once.

Upvotes: 0

Views: 671

Answers (1)

a.lasram

Reputation: 4411

If myU[1] is written by two different threads with a plain (non-atomic) read-modify-write, the result is undefined. You need to use atomicAdd to obtain myU[1]==2.

CUDA programming guide states:

If a non-atomic instruction executed by a warp writes to the same location in global or shared memory for more than one of the threads of the warp, the number of serialized writes that occur to that location varies depending on the compute capability of the device and which thread performs the final write is undefined.
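
As a minimal sketch of the atomicAdd approach (not code from the question), the kernel below redoes lines 2 to 4 of the pseudocode with an atomic increment. The names countHits and runCountHits, the launch configuration, and the omitted error checking are assumptions made for illustration. Because atomicAdd returns the value stored before the addition, the thread that sees an old value of 1 knows it performed the second increment, so each repeated id is reported exactly once without re-reading myU.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each thread increments the counter selected by myindex[tid].
// atomicAdd makes the read-modify-write indivisible, so every thread
// that targets the same id contributes to the final count.
__global__ void countHits(const int *myindex, int *myU, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        int id = myindex[tid];
        // atomicAdd returns the value that was stored before the addition.
        int old = atomicAdd(&myU[id], 1);
        if (old == 1) {
            // This thread performed the second increment of myU[id].
            printf("id %d was incremented more than once\n", id);
        }
    }
}

// Hypothetical host helper: copies myindex to the device, runs the kernel,
// and returns the resulting counters. Error checking is omitted for brevity.
std::vector<int> runCountHits(const std::vector<int> &myindex, int numCounters)
{
    int n = static_cast<int>(myindex.size());
    std::vector<int> myU(numCounters, 0);
    if (n == 0 || numCounters == 0) return myU;

    int *dIndex = nullptr;
    int *dU = nullptr;
    cudaMalloc(&dIndex, n * sizeof(int));
    cudaMalloc(&dU, numCounters * sizeof(int));
    cudaMemcpy(dIndex, myindex.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(dU, 0, numCounters * sizeof(int));

    int threads = 128;                         // assumed block size
    int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
    countHits<<<blocks, threads>>>(dIndex, dU, n);

    // A blocking cudaMemcpy waits for the kernel on the default stream.
    cudaMemcpy(myU.data(), dU, numCounters * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dIndex);
    cudaFree(dU);
    return myU;
}
```

Calling runCountHits({0, 1, 1, 3}, 4) with the data from the question should give the counts the asker expected: myU[0]=1, myU[1]=2, myU[3]=1, with myU[2] left at 0.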

Upvotes: 4
