Reputation: 21505
I have both CPU and GPU (CUDA) matrix classes and I want to overload the operator()
so that I can read or write individual elements of the matrices.
For the CPU matrix class, I was able to do so with
OutType & operator()(const int i) { return data_[i]; }
(write) and
OutType operator()(const int i) const { return data_[i]; }
(read). For the GPU matrix class, I was able to overload the operator()
for reading by
__host__ OutType operator()(const int i) const { OutType d; CudaSafeCall(cudaMemcpy(&d,data_+i,sizeof(OutType),cudaMemcpyDeviceToHost)); return d; }
but I was unable to do the same for writing. Could someone provide any hint to solve this issue?
The writing case for the CPU returns a reference to data_[i], so the assignment is performed by the built-in C++ operator=. I cannot figure out how to exploit the same mechanism for CUDA.
Thanks.
Upvotes: 0
Views: 1090
Reputation: 6420
You can create a separate class that overloads the assignment operator and the type-cast operator, emulating reference behavior:
class DeviceReferenceWrapper
{
public:
    explicit DeviceReferenceWrapper(void* ptr) : ptr_(ptr) {}

    // Writing: copy the host value into device memory
    DeviceReferenceWrapper& operator =(int val)
    {
        cudaMemcpy(ptr_, &val, sizeof(int), cudaMemcpyHostToDevice);
        return *this;
    }

    // Reading: copy the device value back to the host
    operator int() const
    {
        int val;
        cudaMemcpy(&val, ptr_, sizeof(int), cudaMemcpyDeviceToHost);
        return val;
    }

private:
    void* ptr_;
};
and use it in the matrix class (note that operator() must be public, and data is the device pointer):
class Matrix
{
public:
    DeviceReferenceWrapper operator ()(int i)
    {
        return DeviceReferenceWrapper(data + i);
    }

private:
    int* data; // pointer to device memory
};
Upvotes: 1