Reputation: 45
Arrays are stored as xyzxyz..., and I want to get the index of the maximum and minimum along one direction (x, y, or z). Here is the test program:
#include <cuda_runtime.h>
#include <cuda_runtime_api.h> // cudaMalloc, cudaMemcpy, etc.
#include <cublas_v2.h>
#include <helper_functions.h> // shared functions common to CUDA Samples
#include <helper_cuda.h> // CUDA error checking
#include <stdio.h> // printf
#include <iostream>
template <typename T>
void print_arr(T *arr, int L)
{
    for (int i = 0; i < L; i++)
    {
        std::cout << arr[i] << " ";
    }
    std::cout << std::endl;
}

int main()
{
    float hV[10] = {3, 0, 7, 1, 2, 8, 6, 7, 6, 4};
    print_arr(hV, 10);
    float *dV;
    cudaMalloc(&dV, sizeof(float) * 10);
    cudaMemcpy(dV, hV, sizeof(float) * 10, cudaMemcpyHostToDevice);
    cublasHandle_t cublasHandle = NULL;
    checkCudaErrors(cublasCreate(&cublasHandle));
    int hResult[2] = {0};
    checkCudaErrors(cublasIsamax(cublasHandle, 10, dV, 3, hResult + 0));
    checkCudaErrors(cublasIsamin(cublasHandle, 10, dV, 3, hResult + 1));
    print_arr(hResult, 2);
    return 0;
}
expected result:
3 0 7 1 2 8 6 7 6 4
3 2
result:
3 0 7 1 2 8 6 7 6 4
3 5
Is there a problem with this result, or have I misunderstood something?
For reference, see the cuBLAS documentation for cublasIsamin.
Upvotes: 1
Views: 166
Reputation: 151889
cublasIsamin finds the index of the element with the minimum magnitude. This index is not an index into the original array: it counts only the elements visited with stride incx, and it is 1-based. Furthermore, the function searches over n elements (the count passed right after the handle) regardless of other parameters such as incx.
You have an array like this:
index:   0 1 2 3 4 5 6 7 8 9
x/y/z:   x y z x y z x y z x
value:   3 0 7 1 2 8 6 7 6 4
x index: 1     2     3     4
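Therefore the minimum x value (1) sits at index 3 of the original array, which is x index 2 when counting only the x elements starting from 1, and the search covers a total of n=4 (not 10) elements. To search the x values, we must start at offset 0 in dV with an increment of 3, for a maximum of n=4 elements. Passing n=10 with incx=3 makes the search read as far as element 27, well past the end of the 10-element array.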
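To make the indexing concrete, here is a minimal host-side sketch of the same strided search. The names strided_iamin and strided_iamax are hypothetical helpers written for this answer, not part of cuBLAS, and returning 0 for a non-positive n or incx is a convention chosen for the sketch. It only illustrates how n and incx interact:

```cpp
#include <cmath>

// Hypothetical CPU reference, not a cuBLAS function.
// Visits the n elements x[0], x[incx], ..., x[(n-1)*incx] and returns the
// 1-based position, among those n elements, of the smallest magnitude.
// Ties go to the earliest position. Returns 0 if n or incx is not positive
// (a convention chosen for this sketch).
int strided_iamin(const float *x, int n, int incx)
{
    if (n <= 0 || incx <= 0) return 0;
    int best = 0;
    for (int i = 1; i < n; i++)
    {
        if (std::fabs(x[i * incx]) < std::fabs(x[best * incx])) best = i;
    }
    return best + 1;
}

// Same traversal, but returns the 1-based position of the largest magnitude.
int strided_iamax(const float *x, int n, int incx)
{
    if (n <= 0 || incx <= 0) return 0;
    int best = 0;
    for (int i = 1; i < n; i++)
    {
        if (std::fabs(x[i * incx]) > std::fabs(x[best * incx])) best = i;
    }
    return best + 1;
}
```

With the array from the question, strided_iamax(hV, 4, 3) gives 3 and strided_iamin(hV, 4, 3) gives 2. Note that the loop touches x[(n-1)*incx], so n has to be the number of strided elements, not the length of the whole array.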
Taking all this into account, the correct calls are:
checkCudaErrors(cublasIsamax(cublasHandle, 4, dV, 3, hResult + 0));
checkCudaErrors(cublasIsamin(cublasHandle, 4, dV, 3, hResult + 1));
And the expected result is:
3 2
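For the y or z direction the same idea should apply with a shifted start pointer (dV + 1 or dV + 2) and a matching element count. A small sketch of the bookkeeping, assuming a 0-based offset into the array; strided_count and to_array_index are hypothetical helper names, not cuBLAS functions:

```cpp
// Hypothetical helpers for choosing n and interpreting the returned index.

// Number of elements reachable in an array of length len when starting at
// the 0-based offset and stepping by incx. This is the value to pass as n.
int strided_count(int len, int offset, int incx)
{
    if (offset < 0 || len <= offset || incx <= 0) return 0;
    return (len - offset + incx - 1) / incx;
}

// Converts a 1-based strided result (as printed above) back to a 0-based
// index into the original array.
int to_array_index(int result, int offset, int incx)
{
    return offset + (result - 1) * incx;
}
```

For the 10-element array above this gives n=4 for x (offset 0) and n=3 for y and z (offsets 1 and 2). The result 2 for the x minimum maps back to array index 3, and the result 3 for the x maximum maps back to array index 6.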
Upvotes: 2