user1197918

Computing Euclidean distances between 2 matrices in CUDA

I'm writing a program with CUDA and the problem is the following:

-and I proceed with the next row of A.

I've implemented it this way: the grid has ( n * m ) blocks, with 128 threads per block ( 1 * 128 ).

The program compiles, but it doesn't compute the correct distances. I can't figure out what's wrong...

PS: I have CUDA 6.0 with an NVIDIA GTX 650 (compute capability 3.0)

__global__ void EuclidianDistances( float *A, float *B , float *C , int n , int m)
{
    // SIZE is equal to 128
    __shared__ float accumResult[SIZE];
    __shared__ float sA[SIZE];
    __shared__ float sB[SIZE];

    // MAPPING
    int bx = blockIdx.x;  // n
    int by = blockIdx.y;  // m
    int ty = threadIdx.y; // 128
    int tx = threadIdx.x; // 1

    sA[ty] = A [bx * SIZE + ty];
    sB[ty] = B [by * SIZE + ty];
    __syncthreads();

    accumResult[ty] = (sA[ty] - sB[ty])*(sA[ty] - sB[ty]);
    __syncthreads();

    // Parallel tree-reduction
    for (int stride = SIZE/2 ; stride < 0 ; stride >>= 1)
        if (ty < stride)
        {
            accumResult[ty] += accumResult [stride + ty];
            __syncthreads();
        }

    // Writing results to output matrix
    if ((threadIdx.y == 0))
        C [bx * m + by] = accumResult[ty];
    __syncthreads();
}

Upvotes: 0

Views: 1635

Answers (1)

Marco A.

Reputation: 43662

The condition looks wrong:

for (int stride = SIZE/2 ; stride < 0 ; stride >>= 1)

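Assuming SIZE is 128 as you said, the loop body will never execute. Also, the __syncthreads() inside the if statement might stall the entire block: the barrier has to be reached by every thread in the block, so it should not sit inside a branch that only some threads take.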
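To make both points concrete, here is a minimal host-side C++ sketch of what one block's reduction is meant to compute. The helper name treeReduce and the use of std::vector are assumptions for illustration and are not part of the posted kernel; the inner loop stands in for the threads of a block, and the input size is assumed to be a power of two, like SIZE.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side emulation of one block's tree reduction.
// `partial` plays the role of accumResult; its size is assumed to be
// a power of two (SIZE = 128 in the question).
float treeReduce(std::vector<float> partial)
{
    if (partial.empty())
        return 0.0f;

    // Note the condition: stride > 0. Start at size/2 and halve each step.
    for (std::size_t stride = partial.size() / 2; stride > 0; stride >>= 1)
    {
        // On the GPU every ty is its own thread; here they run one by one.
        // Reads come from [stride, 2*stride) and writes go to [0, stride),
        // so running them sequentially gives the same result.
        for (std::size_t ty = 0; ty < stride; ++ty)
            partial[ty] += partial[ty + stride];

        // In the kernel, __syncthreads() would go here, after the `if`,
        // so that every thread of the block reaches the barrier.
    }
    return partial[0];
}
```

Called on 128 values of 1.0f, this returns 128.0f, which is the sum the kernel's thread 0 should end up with in accumResult[0].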


Edit: after reading the OP's comments I realized this is a language problem. Here is a snippet:

#include <iostream>
using namespace std;

int main() {

    int SIZE = 128;

    for (int stride = SIZE/2 ; stride < 0 ; stride >>= 1)
        cout << "Hello I'm running" << endl;



    return 0;
}

http://ideone.com/AyhXYF

The output is: nothing. Take a look at the for loop syntax in C++: the second part is the condition that must hold for the loop to keep running. It is checked before every iteration, including the first, so if it starts out false the loop body never executes. Here stride starts at 64 and 64 < 0 is false; the condition you want is stride > 0.
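Once the loop condition is corrected, a plain host-side computation is a simple way to check what the kernel writes to C. The sketch below is only an assumption about how such a check could look: the function name referenceSquaredDistances and the dim parameter are hypothetical, the row-major layout mirrors the indexing in the posted kernel (A[bx * SIZE + ty], C[bx * m + by]), and like the posted kernel it returns the sum of squared differences without taking a square root.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference, not from the original post.
// A holds n rows and B holds m rows of `dim` floats each (dim = SIZE = 128
// in the question), stored row-major. The result is an n x m row-major
// matrix of squared distances, matching what the kernel accumulates.
std::vector<float> referenceSquaredDistances(const std::vector<float>& A,
                                             const std::vector<float>& B,
                                             std::size_t n, std::size_t m,
                                             std::size_t dim)
{
    std::vector<float> C(n * m, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
    {
        for (std::size_t j = 0; j < m; ++j)
        {
            float sum = 0.0f;
            for (std::size_t k = 0; k < dim; ++k)
            {
                const float d = A[i * dim + k] - B[j * dim + k];
                sum += d * d;
            }
            C[i * m + j] = sum;
        }
    }
    return C;
}
```

When comparing against the GPU result, allow a small tolerance: the host loop and the tree reduction add the terms in a different order, so the float sums can differ slightly. Applying std::sqrt to each entry gives the Euclidean distance itself.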

Upvotes: 1
