Vitality

Reputation: 21505

Compilation seems not to follow the correct path among different operator= overloads

I have two matrix classes, one for the CPU and one for the GPU, say Matrix and CudaMatrix, respectively. Declarations and definitions are in .h, .cpp, .cuh and .cu files. In main, I have

Matrix<int2_>      foo1(1,2);
// Definition of the elements of foo1...
CudaMatrix<int2_>  foo2(1,2);

cout << typeid(foo1).name() << "\n";
cout << typeid(foo2).name() << "\n";

// Assignment
foo2=foo1;

Now, I have no operator= overload that assigns a Matrix to a CudaMatrix, but I do have the following operator= overload

const CudaMatrix& operator=(const CudaMatrix<LibraryNameSpace::int2_>&);

between two CudaMatrix objects. What happens is the following:

  1. The two typeid calls report the correct classes for foo1 and foo2;
  2. The above operator= overload is compiled and invoked at runtime for the foo2=foo1 assignment. I would have expected a compilation error instead;
  3. The assignment produces the correct result for foo2!

I'm using Visual Studio 2010 and compiling in release mode.

Does anyone have any hints as to why this apparently illogical behavior occurs?

Thanks.

Upvotes: 1

Views: 80

Answers (1)

talonmies

Reputation: 72342

The key to why this works is that CudaMatrix has both a constructor that accepts a Matrix (a converting constructor that is not marked explicit) and a user-defined copy assignment operator. Together, these two things make a seemingly undefined case work correctly. So when you do this:

Matrix<int2_>      foo1(1,2);
CudaMatrix<int2_>  foo2(1,2);

foo2 = foo1;

what happens is the equivalent of this:

Matrix<int2_>      foo1(1,2);
CudaMatrix<int2_>  foo2(1,2);

// foo2 = foo1;
{   
    CudaMatrix<int2_> x(foo1); // converting constructor: builds a temporary CudaMatrix from the Matrix
    foo2 = x; // Copy assignment
}

Note that this has device memory implications you should be aware of (i.e. two device memory allocations and two sets of whatever API calls you have under the hood).
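To make the sequence above concrete, here is a minimal host-only sketch. HostMatrix and DeviceMatrix are hypothetical stand-ins for your Matrix and CudaMatrix (I don't know what your real classes look like), and the static counters are only there to show which member functions run; a std::vector stands in for the device buffer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the CPU-side Matrix class.
struct HostMatrix {
    std::vector<int> data;
    HostMatrix(std::size_t rows, std::size_t cols) : data(rows * cols, 0) {}
};

// Hypothetical stand-in for CudaMatrix; no real device memory is involved.
struct DeviceMatrix {
    static int allocations;  // simulated "device" allocations
    static int conversions;  // calls to the converting constructor
    static int assignments;  // calls to copy assignment
    std::vector<int> data;

    DeviceMatrix(std::size_t rows, std::size_t cols) : data(rows * cols, 0) {
        ++allocations;
    }

    // Converting constructor: not marked explicit, so the compiler
    // is allowed to call it implicitly.
    DeviceMatrix(const HostMatrix& h) : data(h.data) {
        ++allocations;
        ++conversions;
    }

    // Copy assignment between two DeviceMatrix objects only.
    DeviceMatrix& operator=(const DeviceMatrix& other) {
        data = other.data;
        ++assignments;
        return *this;
    }
};

int DeviceMatrix::allocations = 0;
int DeviceMatrix::conversions = 0;
int DeviceMatrix::assignments = 0;

// Mirrors the code in the question and returns the first copied element.
int assign_host_to_device() {
    HostMatrix foo1(1, 2);
    foo1.data[0] = 7;
    DeviceMatrix foo2(1, 2);
    foo2 = foo1;  // temporary DeviceMatrix(foo1), then operator=
    return foo2.data[0];
}
```

After one call to assign_host_to_device(), the counters show one conversion, one assignment and two allocations (foo2 itself plus the temporary), which is where the extra device traffic comes from.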

It is worth pointing out that this is not CUDA specific; it is a standard feature of the C++98 object model. You might benefit from reviewing the rule of three, along with implicit conversions through converting constructors, if you want to learn more about how and why this works (and why seemingly analogous counterexamples won't work).
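If you would rather get the compilation error you expected, marking the converting constructor explicit should do it. The sketch below uses hypothetical types (Host, ImplicitDevice, ExplicitDevice), not your classes, and it relies on the C++11 <type_traits> header purely to check at compile time whether the assignment would be accepted, so treat it as an illustration rather than something that will build as-is on every older toolchain.

```cpp
#include <type_traits>

// Hypothetical stand-in for the CPU-side class.
struct Host {};

// Converting constructor is implicit: `dev = host` compiles.
struct ImplicitDevice {
    ImplicitDevice() {}
    ImplicitDevice(const Host&) {}
};

// Converting constructor is explicit: `dev = host` is rejected,
// while `dev = ExplicitDevice(host)` is still allowed.
struct ExplicitDevice {
    ExplicitDevice() {}
    explicit ExplicitDevice(const Host&) {}
};

// Would `ImplicitDevice_object = Host_object` compile?
const bool implicit_assign_ok =
    std::is_assignable<ImplicitDevice&, const Host&>::value;

// Would `ExplicitDevice_object = Host_object` compile?
const bool explicit_assign_ok =
    std::is_assignable<ExplicitDevice&, const Host&>::value;

// The explicit form of the conversion remains available.
const bool explicit_construct_ok =
    std::is_constructible<ExplicitDevice, const Host&>::value;
```

With explicit in place, the host-to-device copy has to be spelled out at the call site, which makes the temporary (and its allocation) visible in the code.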

Upvotes: 2
