Reputation: 113
I'm new to CUDA, and I want to implement a sum of multiplications, as in this equation.
I wrote this code in CUDA, but it doesn't give the correct answer:
mulFV1[idx] = f[idx][idy]*compV2[idy];
mulFV2[idy] = f[idx][idy]*compV1[idx];
and then I send the arrays mulFV1 and mulFV2 to a reduction device function.
The question is: how can I debug it?
Note: to give the full picture, mulFV1 concerns the rows and mulFV2 concerns the columns.
Upvotes: 0
Views: 1084
Reputation: 399
I think your kernel could look like the following:
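// Assumes MAX_X (rows) and MAX_Y (columns) are compile-time constants and f is row-major.
// outv1 and outv2 must be zeroed on the host (e.g. with cudaMemset) before the launch:
// __syncthreads() only synchronizes within one block, so zeroing inside the kernel
// would race with atomicAdd calls from other blocks.
__global__ void kernel_code(const int* f, const int* v1, const int* v2, int* outv1, int* outv2)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x; // row index
    int idy = blockIdx.y * blockDim.y + threadIdx.y; // column index
    if (idx < MAX_X && idy < MAX_Y)
    {
        int fxy = f[idx * MAX_Y + idy];        // f[idx][idy] in row-major storage
        atomicAdd(&outv1[idx], fxy * v2[idy]); // accumulate along the row
        atomicAdd(&outv2[idy], fxy * v1[idx]); // accumulate along the column
    }
}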
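As for how to debug it: one common approach is to compute the same two sums on the host and compare them with the arrays copied back from the device. The following is only a minimal host-side sketch, not the original poster's code. The helper names `reference_sums` and `first_mismatch` are hypothetical, and it assumes that `f[idx][idy]` from the question is stored row-major with `rows` x `cols` entries (`MAX_X` x `MAX_Y` in the kernel above). It is plain C++, so it should also compile inside a `.cu` file.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: host-side reference for the two sums.
//   out1[i] = sum over j of f[i][j] * v2[j]   (one value per row)
//   out2[j] = sum over i of f[i][j] * v1[i]   (one value per column)
// Assumes f is row-major with rows * cols entries.
void reference_sums(const std::vector<int>& f,
                    const std::vector<int>& v1,   // length rows
                    const std::vector<int>& v2,   // length cols
                    std::size_t rows, std::size_t cols,
                    std::vector<int>& out1, std::vector<int>& out2)
{
    out1.assign(rows, 0);
    out2.assign(cols, 0);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            const int fij = f[i * cols + j];
            out1[i] += fij * v2[j];
            out2[j] += fij * v1[i];
        }
    }
}

// Hypothetical helper: index of the first differing element, or -1 if equal.
long first_mismatch(const std::vector<int>& expected, const std::vector<int>& actual)
{
    const std::size_t n =
        expected.size() < actual.size() ? expected.size() : actual.size();
    for (std::size_t k = 0; k < n; ++k) {
        if (expected[k] != actual[k]) return static_cast<long>(k);
    }
    // Equal on the common prefix: a length difference counts as a mismatch at index n.
    return expected.size() == actual.size() ? -1 : static_cast<long>(n);
}
```

After copying `outv1` and `outv2` back with `cudaMemcpy`, passing each one to `first_mismatch` together with the matching reference array points at the first wrong element. Starting with a small, non-square matrix tends to expose indexing mistakes quickly.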
Upvotes: 1
Reputation: 39197
Your variable names suggest that the first line is the multiplication using vector v1 and the second the one using v2, but you are doing it crossed over. Maybe you want to have
mulFV1[idx] = f[idx][idy]*compV1[idy];
mulFV2[idy] = f[idx][idy]*compV2[idx];
with indices 1 and 2 exchanged?
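Whether the vectors are crossed over can be checked on a tiny input where the two variants give different results. Here is a minimal sketch, assuming a row-major `f` and a hypothetical helper `row_sums` that is not from the question. It only covers the `mulFV1` line, and both vectors need one entry per column for the comparison to make sense.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: out[i] = sum over j of f[i][j] * v[j], with f row-major.
std::vector<int> row_sums(const std::vector<int>& f, const std::vector<int>& v,
                          std::size_t rows, std::size_t cols)
{
    std::vector<int> out(rows, 0);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            out[i] += f[i * cols + j] * v[j];
        }
    }
    return out;
}
```

With `f = {1, 2, 3, 4}` as a 2 x 2 matrix, `compV1 = {1, 0}` and `compV2 = {0, 1}`, calling `row_sums(f, compV2, 2, 2)` (the pairing in the question) gives `{2, 4}`, while `row_sums(f, compV1, 2, 2)` (the pairing suggested here) gives `{1, 3}`. Working the equation out by hand for the same input shows which of the two is intended.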
Upvotes: 0