Reputation: 4382
I wrote this code to count the number of characters in a text file:
sr.BaseStream.Position = 0;
sr.DiscardBufferedData();
int Ccount = 0;
while (sr.Peek() != -1)
{
sr.Read();
Ccount++;
}
But after applying this code to a file that contains:
1
2
3
4
5
6
7
8
9
0
Ccount = 30. Why? I am using Windows XP in VirtualBox on my MacBook, and the program I used is Microsoft Visual Basic 2010.
Upvotes: 10
Views: 58366
Reputation: 1601
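There's an easier way to do this. Read the entire *.txt file into a string array and sum the lengths of its lines. Note that this counts only the characters on each line, not the line terminators: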
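int count = 0;
string[] lines = File.ReadAllLines(/*Path to the file here*/);
// ReadAllLines strips the line terminators, so only the characters
// that belong to each line are summed here.
foreach (string line in lines)
{
    count += line.Length;
}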
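To make the difference concrete, here is a minimal sketch that compares the line-by-line sum with the full character count. The class name, the method names, and the temporary file are hypothetical and exist only for illustration; the sample text assumes Windows-style line endings.

```csharp
using System;
using System.IO;

public static class LineLengthDemo
{
    // Sums the lengths of the lines returned by File.ReadAllLines.
    // The line terminators are stripped, so they are not counted.
    public static int CountWithoutNewlines(string path)
    {
        int count = 0;
        foreach (string line in File.ReadAllLines(path))
        {
            count += line.Length;
        }
        return count;
    }

    // Counts every character in the file, line terminators included.
    public static int CountAllCharacters(string path)
    {
        return File.ReadAllText(path).Length;
    }

    public static void Main()
    {
        // Hypothetical sample file: three one-digit lines ending in \r\n.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "1\r\n2\r\n3\r\n");
            Console.WriteLine(CountWithoutNewlines(path)); // 3
            Console.WriteLine(CountAllCharacters(path));   // 9
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Which of the two numbers is "the" character count depends on whether the line terminators should be included.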
Upvotes: 3
Reputation: 12837
The new line is actually 2 separate characters: CR LF (carriage return, then line feed). But you would know that if you put a breakpoint in your loop. Now for extra credit: how many bytes is that in Unicode?
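As a rough sketch of the extra-credit question, the byte count of the two-character Windows newline depends on the encoding. The class and method names are hypothetical; the encodings are the standard ones from System.Text, where Encoding.Unicode means UTF-16 little-endian.

```csharp
using System;
using System.Text;

public static class NewlineByteCount
{
    // Returns how many bytes the given text occupies in the given encoding
    // (without any byte order mark).
    public static int ByteCount(string text, Encoding encoding)
    {
        return encoding.GetByteCount(text);
    }

    public static void Main()
    {
        string newline = "\r\n"; // Windows newline: CR followed by LF

        Console.WriteLine(ByteCount(newline, Encoding.ASCII));   // 2
        Console.WriteLine(ByteCount(newline, Encoding.UTF8));    // 2
        Console.WriteLine(ByteCount(newline, Encoding.Unicode)); // 4 (UTF-16)
        Console.WriteLine(ByteCount(newline, Encoding.UTF32));   // 8
    }
}
```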
Upvotes: 2
Reputation: 54811
Windows typically uses \r\n for a new line, that is, ASCII characters 0x0D (decimal 13, carriage return) and 0x0A (decimal 10, line feed).
I suggest you prove this to yourself by printing the code of each character you read:
Console.WriteLine("0x{0:x}", sr.Read());
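A self-contained version of that check might look like the sketch below. It reads from an in-memory StringReader instead of the asker's file, and the class name, method name, and sample text are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CharDump
{
    // Reads the reader to the end and returns each character code
    // formatted as two-digit hexadecimal, for example "0x0D".
    public static List<string> DumpCodes(TextReader reader)
    {
        var codes = new List<string>();
        int c;
        while ((c = reader.Read()) != -1)
        {
            codes.Add(string.Format("0x{0:X2}", c));
        }
        return codes;
    }

    public static void Main()
    {
        // Hypothetical sample: two one-digit lines with Windows line endings.
        using (var reader = new StringReader("1\r\n2\r\n"))
        {
            Console.WriteLine(string.Join(" ", DumpCodes(reader)));
            // 0x31 0x0D 0x0A 0x32 0x0D 0x0A
        }
    }
}
```

Each digit is followed by 0x0D and 0x0A, which is where the two extra characters per line come from.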
Upvotes: 2
Reputation: 16296
In Windows, each new line consists of two characters, \r and \n. You have 10 lines, and each line has 1 visible character and 2 newline characters, which adds up to 30 characters.
If you had created your file on a Mac or on Unix/Linux, you would have gotten a different result (20 characters), because Unix (including modern macOS) uses only \n and classic Mac OS (before OS X) used only \r for a new line.
You can use some editors (such as Notepad++) to show you new line characters, or even switch between different modes (DOS/Unix/Mac).
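The arithmetic can be checked with a small sketch that rebuilds the ten-line sample with each newline convention and counts characters the same way the question does. The class and method names are hypothetical, and the sample assumes the last line also ends with a line terminator, which is what a total of 30 implies.

```csharp
using System;
using System.IO;
using System.Text;

public static class NewlineConventions
{
    // Builds the ten one-digit lines from the question, ending every line
    // (including the last) with the given line terminator.
    public static string BuildSample(string newline)
    {
        string[] digits = { "1", "2", "3", "4", "5", "6", "7", "8", "9", "0" };
        var sb = new StringBuilder();
        foreach (string digit in digits)
        {
            sb.Append(digit).Append(newline);
        }
        return sb.ToString();
    }

    // Counts characters one at a time, as the loop in the question does.
    public static int CountChars(TextReader reader)
    {
        int count = 0;
        while (reader.Read() != -1)
        {
            count++;
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountChars(new StringReader(BuildSample("\r\n")))); // 30 (Windows)
        Console.WriteLine(CountChars(new StringReader(BuildSample("\n"))));   // 20 (Unix)
        Console.WriteLine(CountChars(new StringReader(BuildSample("\r"))));   // 20 (classic Mac OS)
    }
}
```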
Upvotes: 15
Reputation: 281835
You're reading one character at a time, and each line contains three characters: one digit, one carriage return (\r), and one line feed (\n).
(Windows uses \r\n as its newline sequence. The fact that you're running in a VM on a Mac doesn't affect that.)
Upvotes: 15