Vaibhav Agarwal

Reputation: 1144

carriage return by fgets

I am running the following code:

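#include <stdio.h>
#include <stdlib.h>   /* exit */
#include <string.h>

int main(void) {
    FILE *fp;
    if ((fp = fopen("test.txt", "r")) == NULL) {
        printf("File can't be read\n");
        exit(1);
    }
    char str[50];
    /* fgets returns NULL on error or end-of-file, so check before printing */
    if (fgets(str, sizeof str, fp) != NULL)
        printf("%s", str);
    fclose(fp);
    return 0;
}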

test.txt contains: I am a boy\r\n

Since I am on Windows, \r\n is the newline sequence, so when I read this line from the file I expect str to hold "I am a boy\n\0". Instead I get "I am a boy\r\n". I am using the MinGW compiler.

Upvotes: 9

Views: 13759

Answers (3)

planB

Reputation: 19

The C standard says this about text streams (among other things):

Characters may have to be added, altered, or deleted on input and output to conform to differing conventions for representing text in the host environment. Thus, there need not be a one-to-one correspondence between the characters in a stream and those in the external representation. Data read in from a text stream will necessarily compare equal to the data that were earlier written out to that stream only if: the data consist only of printing characters and the control characters horizontal tab and new-line; no new-line character is immediately preceded by space characters; and the last character is a new-line character.

In other words, if a file is opened in text mode, an implementation is free to add, remove, and modify control characters when moving data to and from disk. That is apparently what the Microsoft implementation does with the carriage return, but the GNU implementation doesn't.
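
Because the translation is left to the implementation, one defensive option is to strip the line ending yourself after fgets, whichever form it takes. A minimal sketch, assuming a hypothetical helper named chomp_line that is not part of the question or of any library:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): cut the string at the first
   '\r' or '\n', so "text\r\n", "text\n" and "text" all become "text".
   Returns the new length. */
size_t chomp_line(char *s)
{
    assert(s != NULL);               /* precondition: s is a valid C string */
    size_t len = strcspn(s, "\r\n"); /* index of first CR/LF, or strlen(s) */
    s[len] = '\0';
    return len;
}
```

Calling chomp_line(str) right after a successful fgets leaves "I am a boy" in str whether the runtime delivered \r\n or just \n.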

Upvotes: 2

Eduard Wirch

Reputation: 9922

The behavior depends on the C library implementation and on the mode you pass to fopen. See this quote from the MSDN documentation on fopen:

b - Open in binary (untranslated) mode; translations involving carriage-return and linefeed characters are suppressed.

This means that if you use the Microsoft C library and open your file without the 'b', the carriage return characters will be removed from the stream.

Since you're using MinGW, your compiler probably links against the GNU C library, which follows the POSIX standard. This is what the GNU documentation on gnu.org says about fopen:

The character ‘b’ in opentype has a standard meaning; it requests a binary stream rather than a text stream. But this makes no difference in POSIX systems (including GNU systems).

In conclusion: you're omitting the 'b' mode character, which opens your stream in text mode. You're on Windows but using a GNU C library, which makes no distinction between text and binary mode. This is why fgets reads both the carriage return and the new line.
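
One way to check what a given C library does is to write a known CRLF line in binary mode and read it back with each mode string. The sketch below assumes two hypothetical helpers, write_sample and first_line_has_cr, and a scratch file name chosen by the caller; none of them come from the documentation quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the exact bytes "I am a boy\r\n" to path.
   "wb" keeps the runtime from translating anything on output.
   Returns 0 on success, -1 on failure. */
int write_sample(const char *path)
{
    static const char bytes[] = "I am a boy\r\n";
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    size_t n = sizeof bytes - 1;     /* exclude the terminating '\0' */
    int ok = fwrite(bytes, 1, n, fp) == n;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: read the first line of path using the given fopen
   mode and report whether fgets stored a '\r'.
   Returns 1 if a CR is present, 0 if not, -1 on error. */
int first_line_has_cr(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
    char str[50];
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    char *got = fgets(str, sizeof str, fp);
    fclose(fp);
    if (got == NULL)
        return -1;
    return strchr(str, '\r') != NULL;
}
```

With "rb" the function should report 1 on any platform, since binary mode suppresses translation. With "r" the result depends on the library: 0 where CRLF is translated to a single new-line, 1 where text and binary streams are identical.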

Upvotes: 10

netcoder

Reputation: 67695

Since I am on Windows, it takes \r\n as a new line character...

This assumption is wrong. The C standard treats carriage return and new line as two different things, as evidenced in C99 §5.2.1/3 (Character sets):

[...] In the basic execution character set, there shall be control characters representing alert, backspace, carriage return, and new-line. [...]

The fgets function description is as follows, in C99 §7.19.7.2/2:

The fgets function reads at most one less than the number of characters specified by n from the stream pointed to by stream into the array pointed to by s. No additional characters are read after a new-line character (which is retained) or after end-of-file. A null character is written immediately after the last character read into the array.

Therefore, when encountering the string I am a boy\r\n, a conforming implementation should read up to and including the \n character. There is no sane reason for the implementation to discard \r based on the platform.
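
Since printf("%s", str) usually shows \r invisibly on a terminal, it helps to make the control characters visible when checking what fgets stored. A minimal sketch, assuming a hypothetical helper escape_line_ends (not a standard function):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy src into dst, spelling each '\r' and '\n' as
   the two visible characters backslash-r and backslash-n. Output is
   truncated to fit dstsize and is always NUL-terminated.
   Returns the number of characters written, excluding the '\0'. */
size_t escape_line_ends(const char *src, char *dst, size_t dstsize)
{
    assert(src != NULL && dst != NULL && dstsize > 0);
    size_t out = 0;
    for (; *src != '\0'; ++src) {
        char c = *src;
        if (c == '\r' || c == '\n') {
            if (out + 2 >= dstsize)  /* need 2 chars plus the '\0' */
                break;
            dst[out++] = '\\';
            dst[out++] = (c == '\r') ? 'r' : 'n';
        } else {
            if (out + 1 >= dstsize)  /* need 1 char plus the '\0' */
                break;
            dst[out++] = c;
        }
    }
    dst[out] = '\0';
    return out;
}
```

Passing the buffer filled by fgets through this helper and printing the result shows whether the carriage return survived the read.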

Upvotes: 6
