Reputation: 42353
#include <fstream>
#include <string>
#include <cassert>

long long GetFileSizeA(const std::string& file_path)
{
    return std::ifstream{file_path, std::ios::ate}.tellg();
}

long long GetFileSizeB(const std::string& file_path)
{
    return std::ifstream{file_path, std::ios::ate | std::ios::binary}.tellg();
}

int main()
{
    auto a = GetFileSizeA("~/test.log");
    auto b = GetFileSizeB("~/test.log");
    assert(a == b); // always true?
}
If the file ~/test.log contains many \r\n sequences, does the C++ standard guarantee GetFileSizeA is identical to GetFileSizeB?
Upvotes: 2
Views: 1061
Reputation: 32484
There is no such guarantee by the C++ standard.
In fact, the code
std::ifstream{file_path, std::ios::ate | std::ios::binary}.tellg();
is not guaranteed to work as intended either. For file-based streams, tellg() goes through a chain of calls (std::basic_istream::tellg -> std::basic_streambuf::pubseekoff -> std::basic_filebuf::seekoff), and the last of these is specified to position the file "as if" by calling std::fseek(). Opening with std::ios::ate is likewise specified as if by std::fseek(file, 0, SEEK_END). std::fseek(), in turn, isn't required to support seeking relative to the end of a binary stream:
int fseek( std::FILE* stream, long offset, int origin );
Sets the file position indicator for the file stream stream.
If the stream is open in binary mode, the new position is exactly offset bytes measured from the beginning of the file if origin is SEEK_SET, from the current file position if origin is SEEK_CUR, and from the end of the file if origin is SEEK_END. Binary streams are not required to support SEEK_END, in particular if additional null bytes are output.
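If the goal is just the size in bytes, one way to sidestep the seek-to-end question is to avoid seeking altogether. The following is a minimal sketch, not part of the original question: it assumes C++17 for std::filesystem::file_size, and the helper names FileSizeFromFilesystem and FileSizeByCounting are made up for illustration.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: asks the filesystem for the size (C++17).
// No stream is opened, so no seek to the end is involved.
// Throws std::filesystem::filesystem_error if the size cannot be determined.
std::uintmax_t FileSizeFromFilesystem(const std::string& file_path)
{
    return std::filesystem::file_size(file_path);
}

// Hypothetical helper: counts the bytes by reading the file in binary mode.
// Slower, but it relies only on sequential reads.
// Returns 0 if the file cannot be opened.
std::uintmax_t FileSizeByCounting(const std::string& file_path)
{
    std::ifstream in{file_path, std::ios::binary};
    std::uintmax_t count = 0;
    for (std::istreambuf_iterator<char> it{in}, end; it != end; ++it)
        ++count;
    return count;
}
```

The first helper is usually the cheaper choice. The second reads the whole file, so it probably only makes sense as a fallback where std::filesystem is not available.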
Upvotes: 1
Reputation: 129374
The standard by no means guarantees that the two are equal. Neither the C nor the C++ standard states whether files use \r\n, \n, or \r as the line ending; that is defined by the OS and/or the application. What the standard C library (and, by extension, the C++ library) does guarantee is that if you read the file in text mode, whatever line endings are actually present are translated into the internal \n form. Equally, the standard doesn't guarantee that the two values always differ.
More importantly, if you read part of the file and then ask "where am I?", the answer may well differ depending on whether the file was opened in binary mode or in text mode. If you plan, for example, to map the file into memory and process it as one large string of characters without translating newlines, then you need to treat it as a binary file.
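To see the difference between the two modes on a given platform, something along these lines can be used. It is only a sketch: CountCharsRead is a hypothetical helper, and whether the two counts actually differ depends on the platform's text-mode translation.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: reads the whole file with the given open mode
// and returns how many characters the stream delivered.
// Returns 0 if the file cannot be opened.
std::size_t CountCharsRead(const std::string& file_path, std::ios::openmode mode)
{
    std::ifstream in{file_path, mode};
    const std::string contents((std::istreambuf_iterator<char>(in)),
                               std::istreambuf_iterator<char>());
    return contents.size();
}
```

Calling CountCharsRead(path, std::ios::in) and CountCharsRead(path, std::ios::in | std::ios::binary) on a file full of \r\n sequences typically gives a smaller count in text mode on Windows, where each \r\n is delivered as a single \n, and the same count on POSIX systems, where text mode does no translation.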
Upvotes: 1