Reputation: 2558
Let's consider the following (simplified) code for reading contents of a binary file:
struct Header
{
    char signature[8];
    uint32_t version;
    uint32_t numberOfSomeChunks;
    uint32_t numberOfSomeOtherChunks;
};

void readFile(std::istream& stream)
{
    // find total size of the file, in bytes:
    stream.seekg(0, std::ios::end);
    const std::size_t totalSize = stream.tellg();

    // allocate enough memory and read entire file
    std::unique_ptr<std::byte[]> fileBuf = std::make_unique<std::byte[]>(totalSize);
    stream.seekg(0);
    stream.read(reinterpret_cast<char*>(fileBuf.get()), totalSize);

    // get the header and do something with it:
    const Header* hdr = reinterpret_cast<const Header*>(fileBuf.get());
    if(hdr->version != expectedVersion) // <- Potential UB?
    {
        // report the error
    }
    // and so on...
}
The way I see it, the following line:
    if(hdr->version != expectedVersion) // <- Potential UB?
contains undefined behavior: we are reading the version member, of type uint32_t, which is overlaid on top of an array of std::byte objects, and the compiler is free to assume that a uint32_t object does not alias anything else.
The question is: is my interpretation correct? If yes, what can be done to fix this code? If not, why is there no UB here?
Note: I understand the purpose of the strict aliasing rule (it allows the compiler to avoid unnecessary loads from memory). I also know that using std::memcpy would be a safe solution in this case, but using std::memcpy would mean additional memory allocations (on the stack, or on the heap if the size of an object is not known).
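For reference, the std::memcpy alternative mentioned in the note could look roughly like the sketch below. It assumes the header sits at the very start of the buffer and that Header is trivially copyable; copyHeader is a hypothetical helper name, not part of the original code. The copy goes into a local object of fixed size, so no heap allocation is involved for the header itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>

struct Header
{
    char signature[8];
    std::uint32_t version;
    std::uint32_t numberOfSomeChunks;
    std::uint32_t numberOfSomeOtherChunks;
};

// Hypothetical helper: copies the leading bytes of a buffer into a Header value.
// Returns std::nullopt if the buffer is too small to contain a header.
std::optional<Header> copyHeader(const std::byte* data, std::size_t size)
{
    if (size < sizeof(Header))
        return std::nullopt;

    Header hdr;
    // Copying the object representation is well defined for trivially
    // copyable types, unlike reading through a reinterpret_cast'ed pointer.
    std::memcpy(&hdr, data, sizeof hdr);
    return hdr;
}
```

In readFile this would be called as copyHeader(fileBuf.get(), totalSize), and the version check would then read from the returned copy.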
Upvotes: 1
Views: 358
Reputation: 3569
what can be done to fix this code?
Wait until http://wg21.link/P0593 or something similar allowing implicit object creation in arrays of char/unsigned char/std::byte is accepted.
Upvotes: 0
Reputation: 238361
The question is: is my interpretation correct?
Yes.
If yes, what can be done to fix this code?
You already know that memcpy is a solution. You can, however, skip memcpy and the extra memory allocation by reading directly into the header object:
Header h;
stream.read(reinterpret_cast<char*>(&h), sizeof h);
Note that reading a binary file this way means that the integer representation used in the file must match the representation used by the CPU. Such a file is therefore not portable to systems with a different CPU architecture.
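If portability matters, one option is to assemble each integer from individual bytes instead of relying on the CPU's representation. The following is only a sketch: it assumes the file stores 32-bit integers in little-endian byte order, which the question does not state, and readU32LE is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: builds a 32-bit unsigned value from four bytes
// stored least significant byte first (little-endian, an assumption here).
// The result does not depend on the byte order of the host CPU.
std::uint32_t readU32LE(const std::byte* p)
{
    return  std::to_integer<std::uint32_t>(p[0])
         | (std::to_integer<std::uint32_t>(p[1]) << 8)
         | (std::to_integer<std::uint32_t>(p[2]) << 16)
         | (std::to_integer<std::uint32_t>(p[3]) << 24);
}
```

With the layout from the question, the version field would start right after the 8-byte signature, so it could be read as readU32LE(fileBuf.get() + 8), provided the buffer holds at least 12 bytes.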
Upvotes: 3