Reputation: 15996
I have some text parsing that I'd like to behave identically whether the input is read from a file or from a stringstream. As such, I'm trying to use an std::istream to perform all the work. In the string version, I'm trying to get it to read from a static in-memory byte array I've created (which originally came from a text file). Let's say the original file looked like this:
4
The corresponding byte array is this:
const char byte_array[] = { 52, 13, 10 };
Here 52 is the ASCII code for the character 4, 13 is the carriage return, and 10 is the line feed.
When I read directly from the file, the parsing works fine.
When I try to read it in "string mode" like this:
std::istringstream iss(std::string(byte_array, sizeof(byte_array)));  // pass the length explicitly: byte_array is not null-terminated
std::istream& is = iss;
I end up getting the carriage returns stuck on the end of the strings I retrieve from the stringstream with this method:
std::string line;
std::getline(is, line);
This screws up my parsing because string.empty() no longer returns true for "blank" lines: every line contains at least a 13 for the carriage return, even if it is empty in the original file that generated the binary data.
Why is the ifstream behaving differently from the istringstream in this respect? How can I have the istringstream version discard the carriage return just like the ifstream version does?
Upvotes: 3
Views: 1985
Reputation: 169018
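std::ifstream operates in text mode by default, which means that on platforms whose native line ending is not a single LF (notably Windows, which uses CRLF) it translates the native line ending to a single LF on input. In this case, std::ifstream is removing the CR character before std::getline() ever sees it. (On POSIX systems text mode performs no translation, so the same CRLF file would keep its CR characters there.)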
std::istringstream does not do any interpretation of the source string, and passes through all bytes as they are in the string.
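To see the pass-through behavior in isolation, here is a minimal sketch. The function name first_line_from_memory is made up for illustration; the bytes are the same three from the question.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: reads the first line from an in-memory CRLF buffer.
std::string first_line_from_memory() {
    const char bytes[] = { 52, 13, 10 };          // '4', CR, LF
    const std::string text(bytes, sizeof(bytes)); // explicit length, no null terminator needed
    std::istringstream iss(text);

    std::string line;
    std::getline(iss, line);  // stops at LF and discards it; CR is left in the line
    return line;              // "4\r"
}
```

The returned string has length 2, which is why empty() never fires on lines that were blank in the original file.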
It's important to note that std::string represents a sequence of bytes, not characters. Typically one uses std::string to store ASCII-encoded text, but it can also be used to store arbitrary binary data. The assumption is that if you have read text from a file into memory, you have already done any text transformations such as standardization of line endings.
The correct course of action here would be to convert line endings when the file is being read. In this case, it looks like you are generating code from a file. The program that reads the file and converts it to code should be eliminating the CR characters.
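As a rough sketch of that conversion step, assuming the generator has the file contents in a std::string before it emits the byte array: the helper name normalize_line_endings is hypothetical, and it only handles CRLF pairs, leaving a lone CR untouched.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of `text` with every CRLF pair replaced by a single LF.
std::string normalize_line_endings(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        const bool cr_before_lf =
            text[i] == '\r' && i + 1 < text.size() && text[i + 1] == '\n';
        if (cr_before_lf) {
            continue;  // drop the CR; the LF is copied on the next iteration
        }
        out.push_back(text[i]);
    }
    return out;
}
```

Running the file contents through something like this before generating the array would leave { 52, 10 } for the example above, so std::getline() yields "4" with no trailing CR.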
An alternative approach would be to write a stream wrapper that takes an std::istream and delegates read operations to it, converting line endings on the fly. This approach is viable, though it can be tricky to get right. (Efficiently handling seeking, in particular, will be difficult.)
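One possible shape for such a wrapper is a filtering std::streambuf layered over the source stream's buffer. This is only a sketch: the class name CrlfFilterBuf is invented here, it buffers a single character at a time (simple, not fast), it does not support seeking or putback beyond the current character, and it drops a CR only when an LF immediately follows.

```cpp
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical filtering buffer: forwards bytes from `source`, dropping any CR that precedes an LF.
class CrlfFilterBuf : public std::streambuf {
public:
    explicit CrlfFilterBuf(std::streambuf* source) : source_(source) {}

protected:
    // Called when the one-character get area is exhausted.
    int_type underflow() override {
        int_type c = source_->sbumpc();
        if (traits_type::eq_int_type(c, traits_type::to_int_type('\r')) &&
            traits_type::eq_int_type(source_->sgetc(), traits_type::to_int_type('\n'))) {
            c = source_->sbumpc();  // discard the CR and take the LF instead
        }
        if (traits_type::eq_int_type(c, traits_type::eof())) {
            return traits_type::eof();
        }
        ch_ = traits_type::to_char_type(c);
        setg(&ch_, &ch_, &ch_ + 1);  // expose exactly one character to the istream
        return traits_type::to_int_type(ch_);
    }

private:
    std::streambuf* source_;  // not owned; must outlive this object
    char ch_ = 0;
};
```

Usage would look roughly like this: construct CrlfFilterBuf filter(iss.rdbuf()); then std::istream is(&filter); and pass is to the existing parsing code. The same wrapper can sit on top of an std::ifstream, so both paths see LF-only line endings.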
Upvotes: 2