How to set elements of an array to zero by index in CUDA?

Question

I am trying to use CUDA to set some elements of an array to zero by index. The array has about 7,000,000 elements and the index list has about 1,000 entries, so I want to write the kernel efficiently. The only technique I know is choosing the block size with cudaOccupancyMaxPotentialBlockSize. Could anyone suggest how to speed this up?

For example, the array is double *a with n elements, and the index list is int *index with n1 entries.

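__global__ void setZero(int n, double *a, int n1, const int *index)
{
  int i = threadIdx.x + blockIdx.x * blockDim.x;
  if (i < n1)              // one thread per entry of index
    a[index[i]] = 0.0;
}

// Host side; gridSize and blockSize are placeholders for the launch configuration.
setZero<<<gridSize, blockSize>>>(n, d_a, n1, d_index);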

As a minimal example, a = {1,2,3,4,5} and index = {1,3} (0-based, as in the kernel) give a = {1,0,3,0,5}.

dreamcrash · Accepted Answer

Given your constraints, I think the following would already be good enough:

__global__ void setZero(int n, double *a, int n1, const int *index)
{
  int id = threadIdx.x + blockIdx.x * blockDim.x;
  // n1 is the length of index, so only the first n1 threads do any work.
  if (id < n1)
     a[index[id]] = 0.0;
}
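
For completeness, here is a minimal sketch of how such a kernel might be launched from the host, using cudaOccupancyMaxPotentialBlockSize as mentioned in the question. The helper name zeroByIndex, the CUDA_CHECK macro, and the extra bounds check on each index are assumptions added for illustration and are not part of the original answer. The grid is sized by n1 (the number of indices), not by n.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
  do {                                                                \
    cudaError_t err_ = (call);                                        \
    if (err_ != cudaSuccess) {                                        \
      std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",                \
                   cudaGetErrorString(err_), __FILE__, __LINE__);     \
      std::exit(EXIT_FAILURE);                                        \
    }                                                                 \
  } while (0)

// One thread per index entry. The range check on target is a defensive
// addition (an assumption, not in the answer): out-of-range entries are skipped.
__global__ void setZero(int n, double *a, int n1, const int *index)
{
  int id = threadIdx.x + blockIdx.x * blockDim.x;
  if (id < n1) {
    int target = index[id];
    if (target >= 0 && target < n)
      a[target] = 0.0;
  }
}

// Hypothetical host helper: copies both arrays to the device, zeroes
// a[index[k]] for every k, and copies a back to the host.
void zeroByIndex(std::vector<double> &a, const std::vector<int> &index)
{
  const int n = static_cast<int>(a.size());
  const int n1 = static_cast<int>(index.size());
  if (n == 0 || n1 == 0) return;

  double *d_a = nullptr;
  int *d_index = nullptr;
  CUDA_CHECK(cudaMalloc(&d_a, n * sizeof(double)));
  CUDA_CHECK(cudaMalloc(&d_index, n1 * sizeof(int)));
  CUDA_CHECK(cudaMemcpy(d_a, a.data(), n * sizeof(double),
                        cudaMemcpyHostToDevice));
  CUDA_CHECK(cudaMemcpy(d_index, index.data(), n1 * sizeof(int),
                        cudaMemcpyHostToDevice));

  // Ask the runtime for a suggested block size, then cover n1 threads.
  int minGridSize = 0, blockSize = 0;
  CUDA_CHECK(cudaOccupancyMaxPotentialBlockSize(&minGridSize, &blockSize,
                                                setZero, 0, 0));
  const int gridSize = (n1 + blockSize - 1) / blockSize;

  setZero<<<gridSize, blockSize>>>(n, d_a, n1, d_index);
  CUDA_CHECK(cudaGetLastError());

  // This blocking copy also waits for the kernel to finish.
  CUDA_CHECK(cudaMemcpy(a.data(), d_a, n * sizeof(double),
                        cudaMemcpyDeviceToHost));

  CUDA_CHECK(cudaFree(d_a));
  CUDA_CHECK(cudaFree(d_index));
}
```

Calling zeroByIndex on a = {1,2,3,4,5} with the 0-based indices {1,3} should leave a = {1,0,3,0,5}. The host-to-device copies are only there to keep the sketch self-contained. With about 1,000 indices the kernel itself needs only a handful of blocks, so if the array already lives on the device, as d_a in the question suggests, it is probably best to keep it there and upload only the index list.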
