Reputation: 2316
I'm writing a parser, and I was having trouble parsing identifiers (anything that's valid as a C++ variable name) and unclosed string literals (anything starting with " but missing the closing ") at the end of my input. I think it's because the lexer (TokenStream) uses std::noskipws in these cases and builds the token character by character. Here is where I believe I have narrowed down the problem (shown for only one of the two cases, as the other uses very similar logic):
std::string TokenStream::get()
{
    char c;
    (*input) >> c; // input is of type istream*
    // other cases...
    if (c == '"')
    {
        std::string s = stringFromChar(c); // just makes a string from the char.
        char d;
        while (true) // 1)
        {
            (*input) >> std::noskipws >> d;
            std::cout << d; // 2)
            if (d == '"')
            {
                s += d;
                (*input) >> std::skipws;
                break;
            }
            s += d;
        }
        return s;
    }
    // other cases...
}
Note that this function is only supposed to generate tokens from the input in a stream-like fashion. Now, if the input ends with either an identifier (like asdf) or an unclosed string (like "asdf), the program hangs, and the line marked 2) outputs the last character of the input (in my examples, f) over and over again forever.
I've solved this problem by adding a check for input->eof(), but my question is this:
Why does the loop (marked 1) in the comments) keep executing when I hit the end of the stream, and why does it print the last character read every time through the loop?
Upvotes: 0
Views: 212
Reputation: 5845
Let's look at the loop in question line by line:
while (true) // 1)
This loops forever unless a break is encountered.
{
(*input) >> std::noskipws >> d;
Read a character. If the read fails (for example, at the end of the stream), the stream enters a failed state and d keeps its previous value.
std::cout << d; // 2)
Print the character that was just read or, after a failed read, the previous value of d.
if (d == '"')
No, the last character was not " (as specified in the question), so this branch is skipped.
{
s += d;
(*input) >> std::skipws;
break;
}
s += d;
}
Therefore, once the input is exhausted, every read fails, d never changes, the break is never reached, and the last character is printed in an endless loop.
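To see this behaviour in isolation, here is a minimal sketch. The helper name traceReads is made up for illustration and is not part of the question's code; it uses a std::istringstream in place of the real input stream and records what d holds after each read attempt, whether or not the read succeeded.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper for illustration only (not from the question's code).
// Attempts `attempts` reads from `text` with whitespace skipping disabled
// and appends the value of `d` after every attempt, successful or not.
std::string traceReads(const std::string& text, int attempts)
{
    std::istringstream in(text);
    std::string seen;
    char d = '?';
    for (int i = 0; i < attempts; ++i)
    {
        in >> std::noskipws >> d; // fails once the input is exhausted
        seen += d;                // after a failed read, d still holds the old character
    }
    return seen;
}
```

With the input asdf and 7 attempts, the four real reads produce a, s, d, f, and the three failed reads each append f again, which matches the repeated last character described in the question.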
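Fix: always use a while loop like this for input, so that the loop ends as soon as a read fails:
char ch;
while ((*input) >> ch) { // input is an istream*, so dereference it first
    // ch contains a new character, deal with it
}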
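Applied to the string-literal case from the question, the same pattern might look like the sketch below. The function readStringLiteral and its signature are assumptions made for illustration, not the asker's actual TokenStream code; it assumes the opening quote has already been consumed and reports whether a closing quote was found.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper for illustration only (not the asker's TokenStream).
// Assumes the opening '"' was already read by the caller.
// Returns true if a closing '"' was found, false if the input ended first.
bool readStringLiteral(std::istream& input, std::string& token)
{
    token = "\"";
    bool closed = false;
    char d;
    input >> std::noskipws;       // keep whitespace inside the literal
    while (input >> d)            // stops as soon as a read fails
    {
        token += d;
        if (d == '"')
        {
            closed = true;
            break;
        }
    }
    input >> std::skipws;         // restore the default for the next token
    return closed;
}
```

Because the stream itself is the loop condition, an unclosed literal such as "asdf ends the loop at the end of the input instead of spinning, and the caller can decide how to report the missing quote.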
Upvotes: 1