Reputation: 2570
I need to read a binary file (in one go) which contains a header and data. There are different ways to read a file in C++, and I would like to know which one is the fastest and most reliable. I also don't know if reinterpret_cast is the best way to turn raw data into a structure.
EDIT: The header structure doesn't have any functions, only data.
ifstream File(Filename, ios::binary); // Opens file
if (!File) // Stops if an error occurred
{
/* ... */
}
File.seekg(0, ios::end);
size_t Size = File.tellg(); // Get size
File.seekg(0, ios::beg);
This is ifstream WITHOUT istreambuf_iterator
char* Data = new char[Size];
File.read(Data, Size);
File.close();
HeaderType *header = reinterpret_cast<HeaderType*>(Data);
/* ... */
delete[] Data;
This is ifstream WITH istreambuf_iterator
std::string Data; // Is it better to use another container type?
Data.reserve(Size);
std::copy((std::istreambuf_iterator<char>(File)), std::istreambuf_iterator<char>(),
std::back_inserter(Data));
File.close();
const HeaderType *header = reinterpret_cast<const HeaderType*>(Data.data());
I also found this on the Internet:
std::ostringstream Data;
Data << File.rdbuf();
File.close();
std::string String = Data.str();
const HeaderType *header = reinterpret_cast<const HeaderType*>(String.data());
Upvotes: 1
Views: 711
Reputation: 153929
First, none of the solutions you describe will actually work; the reinterpret_cast should tell you that. At some point, you'll have to parse the bytes in the buffer and insert the extracted data field by field into your internal data structures.
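As a rough illustration of parsing field by field, here is a minimal sketch. The header layout (a 4-byte magic, a 2-byte version, and a 4-byte data size, all little-endian) is an assumption made up for this example, as are the names readLE, parseHeader, and kHeaderBytes; the real layout has to come from the file format's specification.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical on-disk layout, assumed for illustration only:
//   bytes 0..3  magic     (unsigned 32-bit, little-endian)
//   bytes 4..5  version   (unsigned 16-bit, little-endian)
//   bytes 6..9  dataSize  (unsigned 32-bit, little-endian)
struct HeaderType {
    std::uint32_t magic;
    std::uint16_t version;
    std::uint32_t dataSize;
};

// Size of the header in the file; not necessarily sizeof(HeaderType),
// because the compiler may add padding to the in-memory struct.
constexpr std::size_t kHeaderBytes = 10;

// Assemble an unsigned little-endian integer from `count` bytes (count <= 4).
inline std::uint32_t readLE(const unsigned char* p, std::size_t count) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < count; ++i) {
        value |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    }
    return value;
}

// Extract each field from the raw bytes instead of casting the buffer.
inline HeaderType parseHeader(const std::vector<char>& buffer) {
    if (buffer.size() < kHeaderBytes) {
        throw std::runtime_error("buffer too small for header");
    }
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(buffer.data());
    HeaderType h;
    h.magic = readLE(p, 4);
    h.version = static_cast<std::uint16_t>(readLE(p + 4, 2));
    h.dataSize = readLE(p + 6, 4);
    return h;
}
```

Because every field is rebuilt from individual bytes, the result does not depend on the host's endianness, alignment, or struct padding.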
As for getting the bytes into the buffer as quickly as possible, the less extra work you do, the better. The fastest way would be either to use low-level IO (open and then read under Unix) or even to map the file into memory (mmap under Unix). Of course, this is system dependent; if you want to use ifstream in order to achieve system independence, then using istream::read is certainly the fastest (and the most logical, all things considered). Just be sure that the stream is imbued with the "C" locale, as well as being opened in binary mode.
For the record: using the system-level functions will transfer the data directly from the OS into your buffer. istream::read will copy from an internal buffer in the filebuf into your buffer (and use the system-level functions to get the data into that internal buffer). The other two will build an std::string object, byte by byte, allocating memory as needed, since the final length won't be known.
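For the system-level route, a Unix-only sketch might look like the following. The function name readFilePosix is hypothetical, and the code assumes a regular file whose size fstat reports correctly; it is not meant as a complete or portable implementation.

```cpp
#include <fcntl.h>     // open
#include <sys/stat.h>  // fstat
#include <unistd.h>    // read, close

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Read a whole file with POSIX open/fstat/read (hypothetical helper).
inline std::vector<char> readFilePosix(const std::string& path) {
    const int fd = ::open(path.c_str(), O_RDONLY);
    if (fd < 0) {
        throw std::runtime_error("cannot open " + path);
    }

    struct stat info;
    if (::fstat(fd, &info) != 0) {
        ::close(fd);
        throw std::runtime_error("cannot stat " + path);
    }

    std::vector<char> buffer(static_cast<std::size_t>(info.st_size));
    std::size_t done = 0;
    while (done < buffer.size()) {
        // read may return fewer bytes than requested, so loop until done.
        const ssize_t n =
            ::read(fd, buffer.data() + done, buffer.size() - done);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            ::close(fd);
            throw std::runtime_error("read failed for " + path);
        }
        if (n == 0) break;  // file ended earlier than fstat reported
        done += static_cast<std::size_t>(n);
    }
    ::close(fd);
    buffer.resize(done);  // keep only the bytes actually read
    return buffer;
}
```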
And finally, rather than new char[size], use an std::vector<char>.
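Putting the portable advice together (binary mode, the "C" locale, istream::read, and std::vector<char>), a minimal sketch could look like this. The function name readFileStream is hypothetical, and opening with std::ios::ate to find the size is just one convenient choice.

```cpp
#include <cstddef>
#include <fstream>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Read a whole file in one istream::read call (hypothetical helper).
inline std::vector<char> readFileStream(const std::string& path) {
    std::ifstream file;
    file.imbue(std::locale::classic());  // the "C" locale, set before opening
    // binary: no newline translation; ate: start positioned at the end.
    file.open(path, std::ios::binary | std::ios::ate);
    if (!file) {
        throw std::runtime_error("cannot open " + path);
    }

    const std::streamoff end = file.tellg();  // position at end == file size
    if (end < 0) {
        throw std::runtime_error("cannot determine size of " + path);
    }
    file.seekg(0, std::ios::beg);

    // The vector owns the memory, so no delete[] is needed.
    std::vector<char> buffer(static_cast<std::size_t>(end));
    if (!buffer.empty() &&
        !file.read(buffer.data(),
                   static_cast<std::streamsize>(buffer.size()))) {
        throw std::runtime_error("short read from " + path);
    }
    return buffer;
}
```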
Upvotes: 0
Reputation: 206607
Reading the contents of the file into a char* and then performing reinterpret_cast to HeaderType* is not a good idea.
From the standard:
5.2.10 Reinterpret cast
...
7 An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of type “pointer to T1” is converted to the type “pointer to cv T2”, the result is static_cast<cv T2*>(static_cast<cv void*>(v)) if both T1 and T2 are standard-layout types (3.9) and the alignment requirements of T2 are no stricter than those of T1, or if either type is void. Converting a prvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value. The result of any other such pointer conversion is unspecified.
In your case, if the alignment requirements of HeaderType are stricter than those of char, you will run into undefined behavior.
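If the bytes are already sitting in a char buffer, one common way around the alignment problem is to copy them into a properly aligned HeaderType object with std::memcpy rather than casting the pointer. The sketch below assumes a hypothetical two-field HeaderType and a hypothetical helper copyHeader; it also assumes the file was written with the same layout and byte order as the reading program.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical header, for illustration only.
struct HeaderType {
    std::uint32_t magic;
    std::uint32_t dataSize;
};

// memcpy into an object is only valid for trivially copyable types.
static_assert(std::is_trivially_copyable<HeaderType>::value,
              "HeaderType must be trivially copyable");

// Copy sizeof(HeaderType) bytes starting at `offset` into a real,
// correctly aligned HeaderType object.
inline HeaderType copyHeader(const std::vector<char>& buffer,
                             std::size_t offset = 0) {
    if (offset > buffer.size() ||
        buffer.size() - offset < sizeof(HeaderType)) {
        throw std::runtime_error("buffer too small for header");
    }
    HeaderType header;
    std::memcpy(&header, buffer.data() + offset, sizeof header);
    return header;
}
```

The copy works even when the header starts at an odd offset in the buffer, which is exactly the case where dereferencing a cast pointer could misbehave.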
If you have the choice, I would suggest:
Read the header first.
HeaderType header;
File.read(reinterpret_cast<char*>(&header), sizeof(HeaderType));
Read the rest of the data based on the value of header.
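The two steps could be sketched as follows. The dataSize field, the FileContents struct, the kMaxDataSize limit, and the function names are all assumptions made for this example; the real header decides how the length of the remaining data is encoded.

```cpp
#include <cstdint>
#include <fstream>
#include <istream>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical header: dataSize is assumed to hold the payload length.
struct HeaderType {
    std::uint32_t magic;
    std::uint32_t dataSize;
};
static_assert(std::is_trivially_copyable<HeaderType>::value,
              "reading raw bytes requires a trivially copyable type");

struct FileContents {
    HeaderType header;
    std::vector<char> data;
};

// Hypothetical sanity limit so a corrupt header cannot force a huge allocation.
constexpr std::uint32_t kMaxDataSize = 64u * 1024u * 1024u;

inline FileContents readHeaderThenData(std::istream& in) {
    FileContents result{};
    // Step 1: read the header straight into a properly aligned object.
    if (!in.read(reinterpret_cast<char*>(&result.header),
                 sizeof(HeaderType))) {
        throw std::runtime_error("could not read header");
    }
    if (result.header.dataSize > kMaxDataSize) {
        throw std::runtime_error("header reports an implausible size");
    }
    // Step 2: read the payload whose length the header announced.
    result.data.resize(result.header.dataSize);
    if (!result.data.empty() &&
        !in.read(result.data.data(),
                 static_cast<std::streamsize>(result.data.size()))) {
        throw std::runtime_error("could not read payload");
    }
    return result;
}

// Convenience wrapper that opens the file in binary mode first.
inline FileContents readFileHeaderThenData(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        throw std::runtime_error("cannot open " + path);
    }
    return readHeaderThenData(file);
}
```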
Upvotes: 3
Reputation: 48625
This is going to be 'opinion based' and as such is not strictly on-topic for SO.
However, I don't see the point in using iterators in this case, as the read() function is more succinct.
More importantly, the way you are doing this breaks the strict aliasing rules, and a char array is not guaranteed to be aligned suitably for your struct.
It is always best to cast the address of the struct to a char*, not the other way round:
HeaderType header;
File.read(reinterpret_cast<char*>(&header), sizeof(header));
File.close();
Reading data in binary like this is not portable and won't work for complex user-defined types (like std::string), so it is preferred to serialize all the data members as a formatted string.
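A minimal sketch of formatted serialization, assuming a hypothetical HeaderType that also carries a std::string member. The stream operators and the toText/fromText helpers are made-up names for this example; std::quoted (C++14) is used so that a name containing spaces survives the round trip.

```cpp
#include <cstdint>
#include <iomanip>  // std::quoted (C++14)
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical header with a non-trivial member, for illustration only.
struct HeaderType {
    std::uint32_t magic;
    std::uint32_t dataSize;
    std::string name;
};

// Write each member as text, separated by spaces.
inline std::ostream& operator<<(std::ostream& out, const HeaderType& h) {
    return out << h.magic << ' ' << h.dataSize << ' '
               << std::quoted(h.name) << '\n';
}

// Read the members back in the same order.
inline std::istream& operator>>(std::istream& in, HeaderType& h) {
    return in >> h.magic >> h.dataSize >> std::quoted(h.name);
}

inline std::string toText(const HeaderType& h) {
    std::ostringstream out;
    out << h;
    return out.str();
}

// Returns false if the text could not be parsed.
inline bool fromText(const std::string& text, HeaderType& h) {
    std::istringstream in(text);
    return static_cast<bool>(in >> h);
}
```

The text form is larger and slower to parse than raw bytes, but it does not depend on endianness, padding, or the in-memory representation of std::string.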
NOTE: See docs for reinterpret_cast for information on type aliasing.
Upvotes: 1