user557240

Reputation: 273

Conflicting character counts

I'm trying to find the number of characters in a given text file.

I've tried both a Scanner and a BufferedReader, but I get conflicting results. With the Scanner, I append a newline character to each line and then concatenate the lines, like this:

    FileReader reader = new FileReader("sampleFile.txt");
    Scanner lineScanner = new Scanner(reader);
    String totalLines = "";

    while (lineScanner.hasNextLine()){
        String line = lineScanner.nextLine()+'\n';
        totalLines += line;
    }
    System.out.println("Count "+totalLines.length());

This returns the true character count for my file, which is 5799.

Whereas when I use:

    BufferedReader reader = new BufferedReader(new FileReader("sample.txt"));

    int i;
    int count = 0;
    // read() returns one character at a time, or -1 at the end of the stream
    while ((i = reader.read()) != -1) {
        count++;
    }

    System.out.println("Count "+count);

I get 5892.

I know the lineScanner approach will be off by one if there is only one line, but for my text file I get the correct output.

Also, in Notepad++ the file length in bytes is 5892, but the character count without blanks is 5706.

Upvotes: 4

Views: 93

Answers (2)

Mechkov

Reputation: 4324

You have to account for the newline and carriage-return characters in a text file. Each of them also counts as a character.

I would suggest using the BufferedReader as it will return more accurate results.

Upvotes: 1

Micah Hainline

Reputation: 14427

Your file may have lines terminated with \r\n rather than \n. That could cause your discrepancy.
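The numbers in the question fit this explanation: 5892 - 5799 = 93, which is what you would see if the file had 93 lines ending in \r\n, since `Scanner.nextLine()` strips the whole line terminator and the question's loop adds back only a single '\n'. That is an assumption, because the file itself is not shown. Below is a minimal sketch that reproduces the effect on an in-memory string. The class name `CharCountDemo`, the two helper methods, and the sample text are hypothetical and used only for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Scanner;

public class CharCountDemo {

    // Mirrors the Scanner approach from the question: nextLine() drops the
    // line terminator (\n, \r\n, or \r) and a single '\n' is appended instead.
    static int countWithScanner(String text) {
        StringBuilder total = new StringBuilder();
        try (Scanner lineScanner = new Scanner(new StringReader(text))) {
            while (lineScanner.hasNextLine()) {
                total.append(lineScanner.nextLine()).append('\n');
            }
        }
        return total.length();
    }

    // Mirrors the read() loop from the question: every char is counted,
    // including each '\r' and each '\n'.
    static int countWithReader(String text) {
        int count = 0;
        try (Reader reader = new StringReader(text)) {
            while (reader.read() != -1) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up sample text with Windows-style line endings.
        String sample = "first\r\nsecond\r\nthird\r\n";
        System.out.println("Scanner count: " + countWithScanner(sample));
        System.out.println("Reader count:  " + countWithReader(sample));
    }
}
```

With \r\n line endings, the two counts differ by exactly one per line. With plain \n endings and a trailing newline on the last line, they should agree.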

Upvotes: 2
