Reputation: 31
I have a member variable, m_width, that won't hold any value assigned to it. The relevant part of the class looks as follows:
class GPUFrame
{
private:
    std::shared_ptr<void> m_deviceData;
    unsigned m_pitch = 0;
    unsigned m_width = 2; // testing if m_width will take any value
    unsigned m_height = 0;
    unsigned m_timestamp = 0; // time value in microseconds (absolute value is arbitrary)
    bool m_endOfStream = false; // signifies last frame in the stream
public:
    // make an entirely new allocation
    GPUFrame(unsigned imageWidth, unsigned imageHeight, unsigned allocationCols, unsigned allocationRows,
             unsigned timestamp, bool eos=false)
    {
        // initializer list was causing headaches
        m_pitch = 0;
        m_width = imageWidth;
        m_height = imageHeight;
        m_timestamp = timestamp;
        m_endOfStream = eos;
        // get space from CUDA
        void* newAllocation;
        cudaErr(cudaMallocPitch(&newAllocation, reinterpret_cast<size_t*>(&m_pitch), static_cast<size_t>(allocationCols), static_cast<size_t>(allocationRows)));
        // track allocation with the shared_ptr
        m_deviceData = std::shared_ptr<void>(newAllocation, [=](void* p){ cudaErrNE(cudaFree(p)); });
        std::cout << "imageWidth = " << imageWidth << ", m_width = " << m_width << std::endl;
    }
    // copy from given location
    GPUFrame(CUdeviceptr devPtr, unsigned pitch,
             unsigned imageWidth, unsigned imageHeight, unsigned allocationCols, unsigned allocationRows,
             unsigned timestamp, bool eos=false): GPUFrame(imageWidth, imageHeight, allocationCols, allocationRows, timestamp)
    {
        // copy into a more permanent chunk of memory allocated by above ctor
        cudaErr(cudaMemcpy2D(data(), m_pitch, reinterpret_cast<void*>(devPtr), pitch, allocationCols, allocationRows, cudaMemcpyDeviceToDevice));
    }
};
The output I keep getting:
imageWidth = 1920, m_width = 0
I'm confused why m_width would even be 0; that doesn't even seem like an option. Does anyone have any clue as to what I'm doing wrong?
FWIW, I'm compiling with g++-5 using the --std=gnu++11 option. The full code is available at https://github.com/briantilley/computer-vision.
Upvotes: 1
Views: 112
Reputation: 1613
This:
cudaMallocPitch(&newAllocation, reinterpret_cast<size_t*>(&m_pitch), static_cast<size_t>(allocationCols), static_cast<size_t>(allocationRows));
will cause the pitch to be written to m_pitch. The call expects to write a size_t's worth of data to &m_pitch.
m_pitch is declared as "unsigned", which is not necessarily the same size as a size_t. If you output:
std::cout << sizeof(unsigned) << "\n" << sizeof(size_t);
then I expect that you'll see "4" and "8".
So cudaMallocPitch will write 8 bytes, starting at &m_pitch. This will overwrite the next field, which is m_width. On a little-endian machine such as x86-64, the upper 4 bytes of a small pitch value are all zero, and those are the bytes that land in m_width, which is why it reads back as 0.
Changing the type of m_pitch to size_t should solve this. You should then be able to remove the reinterpret_cast as well.
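To illustrate the suggested fix without needing a GPU, here is a minimal sketch. fakeMallocPitch is a hypothetical stand-in for cudaMallocPitch (it only mimics the "writes a full size_t through the pitch pointer" behaviour, and the 512-byte rounding is an assumption for the example), and FrameInfo and makeFrameInfo are made-up names, not part of the original class.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cudaMallocPitch: like the real call, it writes a
// full size_t through the pitch pointer. No memory is actually allocated.
inline void fakeMallocPitch(void** devPtr, std::size_t* pitch,
                            std::size_t widthBytes, std::size_t /*height*/)
{
    *devPtr = nullptr;
    // Round the row width up to a multiple of 512 bytes (assumed alignment).
    *pitch = (widthBytes + 511) / 512 * 512;
}

// Hypothetical, trimmed-down version of the frame's bookkeeping fields.
struct FrameInfo
{
    std::size_t pitch = 0;  // same type the API writes, so no cast is needed
    unsigned width = 0;
    unsigned height = 0;
};

inline FrameInfo makeFrameInfo(unsigned imageWidth, unsigned imageHeight,
                               unsigned allocationCols, unsigned allocationRows)
{
    FrameInfo f;
    f.width = imageWidth;
    f.height = imageHeight;

    void* ignored = nullptr;
    // &f.pitch is already a size_t*, so the write stays inside f.pitch.
    fakeMallocPitch(&ignored, &f.pitch, allocationCols, allocationRows);

    assert(f.width == imageWidth);  // the neighbouring field is untouched
    return f;
}
```

With the member declared as size_t, the compiler checks the pointer type for you; the reinterpret_cast in the question is what let the mismatch through silently.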
Upvotes: 4
Reputation: 137330
unsigned is typically 4 bytes. std::size_t, on a 64-bit system, is typically 8 bytes. [1]
cudaMallocPitch writes to the thing pointed to by its second argument assuming that it's a size_t, but m_pitch is an unsigned, so it ends up clobbering m_width.
[1] Usual disclaimers about unicorn systems apply.
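If the member has to stay an unsigned for some reason, one possible alternative is to let the API write into a local size_t and narrow the result afterwards. The sketch below assumes that approach; narrowToUnsigned, writePitch, Frame and fillPitch are hypothetical names, and writePitch merely stands in for a call that writes a size_t through a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper: convert size_t to unsigned, refusing values that do not fit.
inline unsigned narrowToUnsigned(std::size_t value)
{
    if (value > std::numeric_limits<unsigned>::max())
        throw std::overflow_error("value does not fit in unsigned");
    return static_cast<unsigned>(value);
}

// Hypothetical stand-in for an API that writes a size_t through a pointer,
// as cudaMallocPitch does with its pitch argument.
inline void writePitch(std::size_t* pitch) { *pitch = 2048; }

struct Frame
{
    unsigned pitch = 0;
    unsigned width = 1920;
};

inline void fillPitch(Frame& f)
{
    std::size_t pitch = 0;              // correctly sized destination for the write
    writePitch(&pitch);
    f.pitch = narrowToUnsigned(pitch);  // explicit, checked conversion
    assert(f.width == 1920);            // the neighbouring member is not clobbered
}
```

In the question's constructor, the same pattern would mean passing the address of a local size_t to cudaMallocPitch and assigning the narrowed value to m_pitch afterwards.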
Upvotes: 2