user2770254

Reputation: 89

My output is assuming the whole file is one line

public static void main(String args[]) throws FileNotFoundException
{       
    String inputFileName = "textfile.txt";

    printFileStats(inputFileName);
}
public static void printFileStats(String fileName) throws FileNotFoundException
{
    String outputFileName = "outputtextfile.txt";
    File inputFile = new File(fileName);
    Scanner in = new Scanner(inputFile);
    PrintWriter out = new PrintWriter(outputFileName);

    int lines = 0;
    int words = 0;
    int characters = 0;

    while(in.hasNextLine())
    {               
        lines++;    
        while(in.hasNext())
        {
            in.next();
            words++;
        }   
    }

    out.println("Lines: " + lines);
    out.println("Words: " + words);
    out.println("Characters: " + characters);

    in.close();
    out.close();

}

I have a text file containing five lines

this is  
a text  
file  
full of stuff  
and lines  

The code creates an output file

Lines: 1  
Words: 10 
Characters: 0

However, if I remove the word-counting loop, the output correctly reports the number of lines (5). Why is this happening?

Upvotes: 1

Views: 121

Answers (4)

dognose

Reputation: 20889

The reason is that hasNext() does not care about line breaks.

So, you are entering the while(in.hasNextLine()) loop, but then you are consuming the whole file with the while(in.hasNext()) loop, resulting in 1 line and 10 words.

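-> Count the line breaks yourself, for example with a delimiter that does not swallow them. With the default delimiter this is not possible: next() skips EOL characters as whitespace, so they never appear in a token.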

OR:

Use String line = scanner.nextLine() to obtain exactly ONE line, and then use a second scanner to fetch all tokens of that line: scanner2 = new Scanner(line); while(scanner2.hasNext())
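
A minimal sketch of the second option, assuming the text is small enough to scan line by line. The class name LineWordCounter and the helper count are made up for illustration; it reads from a String here so it runs without a file, but new Scanner(new File(...)) would work the same way.

```java
import java.util.Scanner;

public class LineWordCounter {
    /** Returns {lines, words} for everything left in the given scanner. */
    static int[] count(Scanner in) {
        int lines = 0;
        int words = 0;
        while (in.hasNextLine()) {
            // nextLine() consumes exactly one line, including its line break.
            String line = in.nextLine();
            lines++;

            // A second scanner tokenizes only this line, so it cannot
            // run past the line break into the rest of the input.
            Scanner lineScanner = new Scanner(line);
            while (lineScanner.hasNext()) {
                lineScanner.next();
                words++;
            }
            lineScanner.close();
        }
        return new int[] {lines, words};
    }

    public static void main(String[] args) {
        // Sample text modelled on the five-line file from the question.
        String text = "this is\na text\nfile\nfull of stuff\nand lines\n";
        int[] result = count(new Scanner(text));
        System.out.println("Lines: " + result[0]); // prints "Lines: 5"
        System.out.println("Words: " + result[1]); // prints "Words: 10"
    }
}
```

Because every call to nextLine() advances the outer scanner, the outer loop always makes progress, even on blank lines.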

Upvotes: 0

Josh

Reputation: 1553

Your inner while loop is gobbling up the whole file. You want to count the number of words in each line, right? Try this instead:

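while (in.hasNextLine())
{
    lines++;
    String line = in.nextLine();
    // Trim and split on runs of whitespace so that leading, trailing,
    // or repeated spaces do not produce empty "words".
    for (String word : line.trim().split("\\s+"))
    {
        if (!word.isEmpty()) // a blank line still yields one empty string
        {
            words++;
        }
    }
}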

Note that splitting on spaces is a very naive approach to tokenization (word-splitting) and will only work for simple examples like the one you have here.

Of course, for non-blank lines you could also do words += line.trim().split("\\s+").length; instead of that inner loop.
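
To illustrate why splitting on single whitespace characters is fragile, here is a small sketch comparing it with a slightly more careful variant. The class SplitPitfalls and both helper names are hypothetical, and the "safe" version is still only whitespace-based tokenization, not real word detection.

```java
public class SplitPitfalls {
    /** Naive count: one "word" per element returned by split("\\s"). */
    static int naiveCount(String line) {
        return line.split("\\s").length;
    }

    /** Trims first, splits on runs of whitespace, and treats blank lines as zero words. */
    static int safeCount(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        // Plain text, a double space, an empty line, and a leading space.
        String[] samples = {"full of stuff", "a  b", "", " a"};
        for (String s : samples) {
            System.out.println("\"" + s + "\": naive=" + naiveCount(s)
                    + ", safe=" + safeCount(s));
        }
    }
}
```

For these samples the naive count is 3, 3, 1 and 2, while the trimmed variant gives 3, 2, 0 and 1: a double space, an empty line, or a leading space each add a phantom empty "word" to the naive result.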

Upvotes: 4

DaoWen

Reputation: 33019

in.hasNext() and in.next() treat all whitespace characters as word separators, including newline characters. Your inner loop is eating all the newlines as it's counting all the words.
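
A short sketch that makes this visible, assuming the scanner's default delimiter. The class TokenDemo and its tokens helper are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class TokenDemo {
    /** Collects every token that next() returns with the default delimiter. */
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        Scanner in = new Scanner(text);
        while (in.hasNext()) {
            // Spaces and line breaks are both skipped as delimiters here.
            result.add(in.next());
        }
        in.close();
        return result;
    }

    public static void main(String[] args) {
        // Two lines of input, but next() never reports where the line ends.
        System.out.println(tokens("this is\na text")); // prints "[this, is, a, text]"
    }
}
```

The tokens come back as one flat sequence, which is why a loop built only on hasNext() and next() cannot tell how many lines it has crossed.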

Upvotes: 1

libik

Reputation: 23029

This reads the next token, not the next line:

in.next();

So it just reads token after token and does not care about line endings. A space and \n are both treated as whitespace by default, so methods like this one do not distinguish between them.

Upvotes: 0
