Reputation: 23
I've got a slight problem. For some reason my function, when measuring the size of a .txt file, counts each newline as if it were two chars instead of one. Here's the function:
#define IN_FILE "in_mat.txt"
#define IN_BUF
#ifdef IN_BUF
void inBuf(char *(&b)){
    streampos size;
    ifstream f(IN_FILE, ios::in);
    f.seekg(0,ios::end);
    size=f.tellg();
    b=new char[size];
    f.seekg(0, ios::beg);
    f.read(b, size);
    f.close();
}
#endif
And here's the file being read:
2 2
1 0
0 1
2 2
i 0
0 -i
2 2
0 1
-1 0
2 2
0 i
i 0
Earlier I added some couts, and it appears that size=60, while the actual size is 49 (I checked it). The file contains 11 newlines, which is exactly 60-49. Could somebody help me with that?
Upvotes: 2
Views: 559
Reputation: 35440
To add to the other answers: if you want to read special characters such as newline characters exactly as they are stored, you should open your file in binary mode, not text mode.
ifstream f(IN_FILE, ios::in | ios::binary);
If you don't open the file in binary mode, the actual characters that make up a line ending (on Windows, '\r' followed by '\n') are translated by the runtime to a single character (namely '\n'). So in text mode, you don't get the "real" version of the file in terms of all of the actual characters that the file consists of.
In addition, functions such as seekg() and tellg() will not work as expected with a file opened in text mode, or at the very least will give you "wrong results" (not wrong as far as the functions themselves are concerned, but wrong if you're writing a program that tries to home in on a position within the file). Again, the newline (and EOF) translation that the runtime does under the hood gets in the way of these functions working as you would expect.
On the other hand, a file opened in binary mode allows these functions to work as expected: there is no translation of newlines or EOF, so whatever individual bytes make up the file contents are what you get.
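As a rough sketch of the binary-mode approach, the function below reads a whole file into a std::string using the same seekg()/tellg() pattern as the question. The name readAllBytes and its error handling are assumptions made for illustration; they are not part of the original code.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the question): returns the raw bytes
// of the file at `path`, with no newline translation.
std::string readAllBytes(const std::string& path) {
    // ios::binary disables the runtime's line-ending translation.
    std::ifstream f(path, std::ios::in | std::ios::binary);
    if (!f) {
        throw std::runtime_error("cannot open " + path);
    }

    // Seek to the end to learn the size in bytes, then rewind.
    f.seekg(0, std::ios::end);
    const std::streamoff size = f.tellg();
    if (size < 0) {
        throw std::runtime_error("cannot determine size of " + path);
    }
    f.seekg(0, std::ios::beg);

    // Read everything in one call; shrink the buffer if fewer bytes arrived.
    std::string buf(static_cast<std::size_t>(size), '\0');
    f.read(&buf[0], size);
    buf.resize(static_cast<std::size_t>(f.gcount()));
    return buf;
}
```

Returning a std::string instead of filling a char* allocated with new[] is a design choice of this sketch: the buffer carries its own length and is released automatically. Every '\r' stored in the file shows up in the returned bytes.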
The next thing you need to determine is whether it is a Unix text file or a Windows text file. Depending on which one it is, the line endings will be different.
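One way to make that check, sketched under the assumption that the file's bytes have already been read in binary mode into a std::string, is to look at the character just before the first '\n'. The enum LineEnding and the function detectLineEnding are hypothetical names used only for this example.

```cpp
#include <string>

// Hypothetical names, for illustration only.
enum class LineEnding { None, Lf, CrLf };

// Classifies a buffer by its first line break:
//   CrLf if the first '\n' is preceded by '\r' (Windows style),
//   Lf   if it is not (Unix style),
//   None if the buffer contains no '\n' at all.
LineEnding detectLineEnding(const std::string& bytes) {
    const std::string::size_type pos = bytes.find('\n');
    if (pos == std::string::npos) {
        return LineEnding::None;
    }
    if (pos > 0 && bytes[pos - 1] == '\r') {
        return LineEnding::CrLf;
    }
    return LineEnding::Lf;
}
```

This only inspects the first line break, so a file with mixed line endings would be classified by whichever style appears first.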
Upvotes: 1
Reputation: 76
I am assuming that you are running on Windows. If not, disregard my answer below.
Windows stores each newline in a text file as two characters (CR LF, or '\r' '\n'). So seeking to the end of the file and calling tellg() returns the size of the file in bytes (60), not the text size (49).
In order to get the text size (49), one solution would be to count the newlines (11) and subtract that number from the total byte size.
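A minimal sketch of that subtraction, assuming the raw bytes of the file are already in a std::string: it counts "\r\n" pairs rather than bare '\n' characters, so a file with Unix line endings is left unchanged. The function names countCrlfPairs and textSize are made up for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts occurrences of the two-byte sequence "\r\n".
std::size_t countCrlfPairs(const std::string& bytes) {
    std::size_t pairs = 0;
    for (std::size_t i = 0; i + 1 < bytes.size(); ++i) {
        if (bytes[i] == '\r' && bytes[i + 1] == '\n') {
            ++pairs;
            ++i;  // skip the '\n' that belongs to this pair
        }
    }
    return pairs;
}

// Hypothetical helper: size of the content when every "\r\n" pair
// is counted as a single newline character.
std::size_t textSize(const std::string& bytes) {
    return bytes.size() - countCrlfPairs(bytes);
}
```

For the file in the question, with 60 bytes and 11 CR LF pairs, textSize would give 60 - 11 = 49.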
Upvotes: 0
Reputation: 588
Windows uses "\r\n" to return to the beginning of the line ('\r') and begin a new one ('\n').
To remove them from your count you have to read the whole file and count the number of '\r's.
Upvotes: 1
Reputation: 585
Windows stores newlines as two characters: '\r' and '\n', known as carriage return and line feed. That's why each newline is counted twice: there are actually two characters to be counted.
Upvotes: 0