Reputation: 761
I was trying out the cuBLAS library function cublasDgemm() on a small matrix, but it is not giving me the result I expect.
I declare and initialize the matrices as follows:
double alpha = 1.0, beta = 0.0;
double * sa = (double *)malloc(6*sizeof(double));
double * sb = (double *)malloc(6*sizeof(double));
double * sc = (double *)malloc(4*sizeof(double));
for (a = 0; a < 2; a++)
    for (b = 0; b < 3; b++) {
        sa[a*3+b] = a+b+1.0;
        sb[a*3+b] = a+b+1.0;
    }
Just for the record, I also tried:
for (a = 0; a < 2; a++)
    for (b = 0; b < 3; b++) {
        sa[IDX2F(a, b)] = a+b+1.0;
        sb[IDX2F(a, b)] = a+b+1.0;
    }
where
#define IDX2C(i,j,ld) (((j)*(ld))+(i))
This gives me:
sa:
1.00 2.00 3.00
3.00 2.00 3.00
sb:
1.00 2.00
3.00 2.00
3.00 4.00
I then allocate memory on the GPU, copy the matrices over, and call cublasDgemm() as follows:
double *dsa, *dsb, *dsc;
cudaMalloc((void **) &dsa, 6*sizeof(*sa));
cudaMalloc((void **) &dsb, 6*sizeof(*sb));
cudaMalloc((void **) &dsc, 4*sizeof(*sc));
cublasSetMatrix(2, 3, sizeof(*sa), sa, 2, dsa, 2);
cublasSetMatrix(3, 2, sizeof(*sb), sb, 3, dsb, 3);
cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, 2, 2, 3, &alpha, dsa, 2, dsb, 3, &beta, dsc, 2);
cublasGetMatrix(2, 2, sizeof(*sc), dsc, 2, sc, 2);
However, when I print the matrix sc, I get:
sc:
16.00 18.00
23.00 26.00
whereas it is supposed to be (according to MATLAB):
16.00 18.00
18.00 22.00
I'm not sure why I'm getting this wrong answer. Can anyone spot a possible mistake I made? Thanks a ton!
Upvotes: 0
Views: 242
Reputation: 152164
I would recommend doing proper error checking on all your cuBLAS calls.
You are getting this result because cuBLAS expects the matrices passed to it to be in column-major order. The results you are expecting are correct for row-major matrices (and note what @pQB said about the error in your setup code).
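To make the column-major point concrete, here is a minimal CPU sketch (plain C, no GPU needed). It assumes the host buffers were filled by the first loop in the question, so that sa and sb both hold 1, 2, 3, 2, 3, 4 in memory. The names gemm_colmajor and reproduce_observed are hypothetical helpers written for this illustration, not cuBLAS functions; gemm_colmajor only mimics what a column-major GEMM does for alpha = 1 and beta = 0.

```c
/* CPU stand-in for a column-major GEMM with alpha = 1, beta = 0:
   C (m x n) = A (m x k) * B (k x n), element (i, j) stored at [i + j*ld].
   Hypothetical helper for illustration; not part of cuBLAS. */
void gemm_colmajor(int m, int n, int k,
                   const double *A, int lda,
                   const double *B, int ldb,
                   double *C, int ldc)
{
    for (int j = 0; j < n; j++)
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            for (int l = 0; l < k; l++)
                sum += A[i + l * lda] * B[l + j * ldb];
            C[i + j * ldc] = sum;
        }
}

/* Fills c with the product of the question's buffers, read as column-major. */
void reproduce_observed(double c[4])
{
    /* Assumed flat contents produced by sa[a*3+b] = a+b+1.0 (same for sb). */
    const double sa[6] = {1, 2, 3, 2, 3, 4};
    const double sb[6] = {1, 2, 3, 2, 3, 4};

    /* Same shapes and leading dimensions as the cublasDgemm call:
       m = 2, n = 2, k = 3, lda = 2, ldb = 3, ldc = 2. */
    gemm_colmajor(2, 2, 3, sa, 2, sb, 3, c, 2);
}
```

Calling reproduce_observed(c) from a main() leaves 16, 18, 23, 26 in c, which is the sequence printed in the question. So under that assumption the library did what it was asked to do with column-major inputs.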
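In addition, you can pass row-major data directly to cuBLAS and still get sensible results if you modify the calling setup. A row-major matrix read as column-major is its transpose, so swapping the two operands (and their dimensions) computes C^T = B^T A^T, which is exactly C in row-major storage.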
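One way to modify the call for row-major data can be sketched on the CPU as follows. This is an illustration under stated assumptions, not a definitive recipe: gemm_colmajor is again a hypothetical stand-in for a column-major GEMM, gemm_rowmajor and rowmajor_example are made-up names, and the matrix values are illustrative rather than taken from the question.

```c
/* CPU stand-in for a column-major GEMM with alpha = 1, beta = 0.
   Hypothetical helper for illustration; not part of cuBLAS. */
void gemm_colmajor(int m, int n, int k,
                   const double *A, int lda,
                   const double *B, int ldb,
                   double *C, int ldc)
{
    for (int j = 0; j < n; j++)
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            for (int l = 0; l < k; l++)
                sum += A[i + l * lda] * B[l + j * ldb];
            C[i + j * ldc] = sum;
        }
}

/* Row-major C (m x n) = A (m x k) * B (k x n), using only the
   column-major routine: swap the operands and swap m with n.
   The leading dimensions become the row lengths (n, k, n).
   The analogous cuBLAS call, with hypothetical device pointers
   dA, dB, dC, would be
   cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, m, k,
               &alpha, dB, n, dA, k, &beta, dC, n); */
void gemm_rowmajor(int m, int n, int k,
                   const double *A, const double *B, double *C)
{
    gemm_colmajor(n, m, k, B, n, A, k, C, n);
}

/* Illustrative data (not from the question):
   A = [1 2 3; 4 5 6], B = [7 8; 9 10; 11 12], both stored row-major. */
void rowmajor_example(double c[4])
{
    const double A[6] = {1, 2, 3, 4, 5, 6};
    const double B[6] = {7, 8, 9, 10, 11, 12};
    gemm_rowmajor(2, 2, 3, A, B, c);
}
```

After rowmajor_example(c), the buffer holds 58, 64, 139, 154, which is the row-major layout of A*B = [58 64; 139 154]. No explicit transpose or copy is needed; only the argument order and the leading dimensions change.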
Upvotes: 2