grayasm

Reputation: 993

Carriage return as line ending in c++ file

I have been reading the ISO 14882:2003. It says:

s-char:
any member of the source character set except the double-quote ", backslash \, or new-line character
escape-sequence
universal-character-name

Now, about the new-line character: I see a problem when the line ending is '\r'.
I wrote a small C++ program:

#include <fstream>
#include <string>
int main()
{
    const char* program=""
        "#include <string>\n"
        "int main()\n"
        "{\n"
        "  std::string s;\n"
        "  //s=\"\r"
        "  //\r"
        "  //\r"
        "  //\r"
        "  //\";\n"
        "  s=\"\\xAE\\xfffactory\\xAE\\xffaction\";\n"
        "  return 0;\n"
        "}\n"
        ;
    std::ofstream file("file.cpp", std::ios_base::trunc);
    file << program;
    file.close();
    return 0;
}

On Windows, file.cpp (as read in VS editor) is:

#include <string>
int main()
{
  std::string s;
  //s="
  //
  //
  //
  //";
  s="\xAE\xfffactory\xAE\xffaction";
  return 0;
}

When compiling file.cpp, VS reports an error on line 6 instead of line 10.

On Linux, file.cpp (as read in emacs) is:

#include <string>
int main()
{
  std::string s;
  //s="^M  //^M  //^M  //^M  //";
  s="\xAE\xfffactory\xAE\xffaction";
  return 0;
}

Compiling file.cpp with gcc I get an error in line 10, not in line 6.

What should I conclude from this?

Upvotes: 0

Views: 5557

Answers (4)

Soren

Reputation: 14718

Now, about new-line character I see a problem when the line ending is '\r'...

'\r' is a carriage return and not a newline -- so I'm not sure what the problem is?

Windows does some magic to represent \r as a newline, but that does not mean it actually is one.

Upvotes: 0

John Bartholomew

Reputation: 6616

Section 2.1 [lex.phases]. The first phase of translation is:

Physical source file characters are mapped, in an implementation-defined manner, to the basic source character set (introducing new-line characters for end-of-line indicators) if necessary. ...

In other words, the implementation is free to use whatever line ending convention it wants, and turn that into newline characters during the first phase of translation.

Practically speaking, you should be safe using the newline character for line endings on any modern compiler.

Upvotes: 1

Yakov Galka

Reputation: 72549

You should conclude that:

  1. The VS editor understands any line ending, so it displays the file as multiple lines (well, this is a known feature).
  2. The MSVC compiler doesn't understand \r line endings, so it actually counts the "; line as the 6th line.
  3. emacs doesn't understand \r line endings (at least by default), so it shows you the comment on a single line.
  4. GCC understands any line ending, so it doesn't lose count.
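
To make the 6-versus-10 discrepancy concrete, here is a minimal sketch of the two counting policies. `line_number_at` is a hypothetical helper written only for this illustration; it is not how MSVC or GCC actually track lines. It merely assumes that one tool treats a lone \r as ordinary text while the other treats it as a line ending.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the 1-based line number of byte offset `pos`.
// cr_is_line_ending == false: only '\n' ends a line.
// cr_is_line_ending == true:  "\r\n", a lone '\r', and a lone '\n' each end a line.
std::size_t line_number_at(const std::string& text, std::size_t pos,
                           bool cr_is_line_ending)
{
    std::size_t line = 1;
    for (std::size_t i = 0; i < pos && i < text.size(); ++i) {
        if (text[i] == '\n') {
            ++line;
        } else if (cr_is_line_ending && text[i] == '\r') {
            ++line;
            if (i + 1 < text.size() && text[i + 1] == '\n')
                ++i;  // count "\r\n" as a single line ending
        }
    }
    return line;
}
```

Applied to the contents of file.cpp from the question, the offset of the faulty `s=` statement lands on line 6 when \r is not a line ending and on line 10 when it is, which matches the two error locations reported.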

Ah, also the quote you provided from the standard is unrelated. The new-line there refers to a new-line character in the source text, not to the \r and \n escape sequences inside strings. The grammar rule you quoted just excludes string literals such as:

const char* s = "some text, here comes 'new-line'
    ha ha ";
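
For contrast, a small sketch of two ways to get a line break into the string without putting a raw new-line between the quotes. The variable names are made up for this example.

```cpp
#include <string>

// Valid: the line break is written as the escape sequence \n,
// so no raw new-line appears inside the literal.
const char* with_escape = "some text, here comes 'new-line'\n    ha ha ";

// Also valid: adjacent string literals are concatenated by the compiler,
// so the literal can span several source lines.
const char* concatenated =
    "some text, here comes 'new-line'\n"
    "    ha ha ";
```

Both variables should end up holding the same characters.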

Upvotes: 7

SingleNegationElimination

Reputation: 156308

Windows and Linux use different line-ending conventions. On Linux the end of line is 0x0A, and on Windows it's 0x0D, 0x0A. C/C++ source files are themselves text files and are often interoperable across platforms, so long as you conform to the text conventions of the platform.

The dos2unix(1) tool is purpose-built for just this task.

Alternatively, since you're producing this code dynamically in your own tool, you could provide an option that tells it to use one line-ending style or the other.
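
If the generator should pick the style itself, one possible approach is to normalize the text before writing it. The sketch below is only an illustration: `normalize_to_lf` is a hypothetical name, and it assumes you want every "\r\n" pair and every lone \r turned into a single \n.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: rewrites "\r\n" and lone '\r' as '\n'.
std::string normalize_to_lf(const std::string& text)
{
    std::string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\r') {
            out += '\n';
            if (i + 1 < text.size() && text[i + 1] == '\n')
                ++i;  // swallow the '\n' of a "\r\n" pair
        } else {
            out += text[i];
        }
    }
    return out;
}
```

Note that a std::ofstream opened in text mode on Windows translates each \n to \r\n on output; opening the file with std::ios_base::binary writes the bytes unchanged.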

Upvotes: 1
