Tal_

Reputation: 761

cublasDgemm() not giving me what I expect

I was trying out the cuBLAS library function cublasDgemm() on a small matrix, but it is not giving me the result I expect.

I declare and initialize the matrices in the following way:

    double alpha = 1.0, beta = 0.0;
    double * sa = (double *)malloc(6*sizeof(double));
    double * sb = (double *)malloc(6*sizeof(double));
    double * sc = (double *)malloc(4*sizeof(double));

    int a, b;
    /* Fill both 6-element buffers with a+b+1, indexed as 2 rows of 3 */
    for (a = 0; a < 2; a++) {
        for (b = 0; b < 3; b++) {
            sa[a*3+b] = a + b + 1.0;
            sb[a*3+b] = a + b + 1.0;
        }
    }

Just for the record, I also tried:

    for (a = 0; a < 2; a++) {
        for (b = 0; b < 3; b++) {
            sa[IDX2F(a, b)] = a + b + 1.0;
            sb[IDX2F(a, b)] = a + b + 1.0;
        }
    }

where

    #define IDX2C(i,j,ld) (((j)*(ld))+(i))

This gives me:

sa:

1.00 2.00 3.00
3.00 2.00 3.00

sb:

1.00 2.00
3.00 2.00
3.00 4.00

Then I allocate memory on the GPU, copy the matrices over, and call cublasDgemm as follows:

    double *dsa, *dsb, *dsc;
    cudaMalloc((void **) &dsa, 6*sizeof(*sa));
    cudaMalloc((void **) &dsb, 6*sizeof(*sb));
    cudaMalloc((void **) &dsc, 4*sizeof(*sc));

    cublasSetMatrix(2, 3, sizeof(*sa), sa, 2, dsa, 2);
    cublasSetMatrix(3, 2, sizeof(*sb), sb, 3, dsb, 3);

    cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, 2, 2, 3, &alpha, dsa, 2, dsb, 3, &beta, dsc, 2);

    cublasGetMatrix(2, 2, sizeof(*sc), dsc, 2, sc, 2);

However, when I print the matrix sc, I get:

sc:

16.00 18.00
23.00 26.00

when it is supposed to be (according to MATLAB):

16.00 18.00
18.00 22.00

I'm not sure why I'm getting this wrong answer. Can anyone spot a possible mistake I made? Thanks a ton!

Upvotes: 0

Views: 242

Answers (1)

Robert Crovella

Reputation: 152164

I would recommend doing proper error checking on all your cuBLAS calls, i.e. checking the status value each call returns.

The result you are getting is because cuBLAS expects the matrices given to it to be in column-major order. The results you are expecting are correct for row-major matrices (and note what @pQB said about the error in your setup code).
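
To make the storage-order point concrete, here is a minimal CPU-only sketch (no cuBLAS involved) that reads the same 6-element buffer from the question both ways. The helper names `at_row_major`, `at_col_major`, and `show_sa_both_ways` are made up for this illustration and are not part of any library.

```c
#include <stdio.h>

/* Element (i, j) of a buffer read as row-major with `cols` columns. */
double at_row_major(const double *buf, int cols, int i, int j)
{
    return buf[i * cols + j];
}

/* Element (i, j) of a buffer read as column-major with leading dimension ld.
   This is the indexing a column-major BLAS assumes for an untransposed operand. */
double at_col_major(const double *buf, int ld, int i, int j)
{
    return buf[j * ld + i];
}

/* Hypothetical demo: fill sa as in the question (sa[a*3+b] = a+b+1.0),
   then print it as a 2x3 matrix under both interpretations. */
void show_sa_both_ways(void)
{
    double sa[6];
    for (int a = 0; a < 2; a++)
        for (int b = 0; b < 3; b++)
            sa[a * 3 + b] = a + b + 1.0;

    printf("row-major view:\n");
    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 3; j++)
            printf("%.2f ", at_row_major(sa, 3, i, j));
        printf("\n");
    }

    printf("column-major view (ld = 2):\n");
    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 3; j++)
            printf("%.2f ", at_col_major(sa, 2, i, j));
        printf("\n");
    }
}
```

Calling `show_sa_both_ways()` from `main` shows rows `1 2 3` and `2 3 4` in the row-major view, but rows `1 3 3` and `2 2 4` in the column-major view, so the library multiplies a different matrix than the one the fill loop suggests.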

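In addition, you can pass row-major ordered data directly to cuBLAS and get sensible results, if you modify the calling setup. One common way is to swap the two operands, because a row-major buffer holding C is the same memory as a column-major buffer holding C^T = B^T A^T.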
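
As a rough sketch of one such modified setup, the code below uses a plain C stand-in for a column-major GEMM and feeds it row-major buffers with the operands swapped. `ref_dgemm_colmajor` and `row_major_matmul` are hypothetical names for this illustration; the stand-in only mimics the no-transpose case and is not cuBLAS itself.

```c
/* Hypothetical CPU stand-in for a column-major GEMM with no transposes:
   C (m x n) = alpha * A (m x k) * B (k x n) + beta * C.
   All three matrices are indexed column-major via their leading dimensions. */
void ref_dgemm_colmajor(int m, int n, int k, double alpha,
                        const double *A, int lda,
                        const double *B, int ldb,
                        double beta, double *C, int ldc)
{
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double acc = 0.0;
            for (int l = 0; l < k; l++)
                acc += A[l * lda + i] * B[j * ldb + l];
            /* As in BLAS, C is not read when beta is zero. */
            C[j * ldc + i] = (beta == 0.0) ? alpha * acc
                                           : alpha * acc + beta * C[j * ldc + i];
        }
    }
}

/* Row-major C (rows x cols) = A (rows x inner) * B (inner x cols),
   computed by asking the column-major routine for C^T = B^T * A^T.
   The row-major buffers are passed unchanged; only the argument order,
   the m/n sizes, and the leading dimensions differ from the naive call. */
void row_major_matmul(int rows, int cols, int inner,
                      const double *A, const double *B, double *C)
{
    ref_dgemm_colmajor(cols, rows, inner, 1.0,
                       B, cols,    /* first operand: B, ld = columns of B */
                       A, inner,   /* second operand: A, ld = columns of A */
                       0.0, C, cols);
    /* Assumption, untested here: for row-major 2x3 dsa and 3x2 dsb, the
       analogous cuBLAS call would be
       cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, 2, 2, 3,
                   &alpha, dsb, 2, dsa, 3, &beta, dsc, 2); */
}
```

For example, with row-major A = {1,2,3,4,5,6} (2x3) and B = {7,8,9,10,11,12} (3x2), `row_major_matmul(2, 2, 3, A, B, C)` leaves C = {58, 64, 139, 154}, which is the row-major product. The appeal of this approach is that no data needs to be transposed or copied on either side.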

Upvotes: 2
