Levy

Reputation: 36

Count number of words from file converted to string

I am trying to count the number of words in a file whose contents I read into a String. I also print the String to check that it matches the contents of the file exactly.

However, my word count method counts the last word of the previous line and the first word of the next line as one word.

Example: "Test word (newline) test words" outputs as "Test wordtest words"

I tried adding "\n" to my code. The output now displays correctly, but the words are still counted as before.

Any help would be appreciated.

Upvotes: 0

Views: 151

Answers (5)

Master Azazel

Reputation: 600

Why don't you just do this:

String sentence = "This is a sentence.";
String[] words = sentence.split(" ");
System.out.println(words.length);

Split your string at the " " and count the resulting words.
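One caveat with splitting on a single space: a newline is not a space, so the last word of one line and the first word of the next stay glued together as one token, which is the symptom described in the question. A minimal sketch, assuming the file content is already in one String (the class and method names are made up for illustration), that splits on any run of whitespace instead:

```java
public class SplitWordCount {
    // Counts words by splitting on runs of whitespace (spaces, tabs, newlines).
    static int countWords(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        return trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        String text = "Test word\ntest words";
        // Splitting on a single space keeps "word\ntest" as one token.
        System.out.println(text.split(" ").length); // 3
        System.out.println(countWords(text));       // 4
    }
}
```

Trimming first avoids an empty leading token when the text starts with whitespace.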

Upvotes: 0

Joe

Reputation: 320

Here's the reason why "Test word (newline) test words" outputs as "Test wordtest words":

in.nextLine() returns the line as a String, excluding the newline character at the end of the line. See https://docs.oracle.com/javase/8/docs/api/java/util/Scanner.html#nextLine--
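To see the effect in isolation, here is a small sketch. It assumes the question's code appends each line to one String, which is not shown in the post; readAll and the class name are hypothetical helpers, and a Scanner over a String stands in for the file.

```java
import java.util.Scanner;

public class NextLineDemo {
    // Joins all lines from the scanner, putting `separator` between consecutive lines.
    static String readAll(Scanner in, String separator) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        while (in.hasNextLine()) {
            if (!first) {
                sb.append(separator);
            }
            sb.append(in.nextLine()); // nextLine() drops the line terminator
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String fileContent = "Test word\ntest words";
        System.out.println(readAll(new Scanner(fileContent), ""));  // Test wordtest words
        System.out.println(readAll(new Scanner(fileContent), " ")); // Test word test words
    }
}
```

With an empty separator the two lines run together, so any counter that looks for spaces sees "wordtest" as a single word.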

It would be more efficient, though, to keep a running word count instead of appending the lines to a String and counting at the end. Assuming in is the Scanner reading the file, it would be something like this:

int wordCount = 0;
while (in.hasNextLine()) {
    String line = in.nextLine().trim();
    if (!line.isEmpty()) {
        // Words on this line: tokens separated by runs of whitespace.
        wordCount += line.split("\\s+").length;
    }
}

Upvotes: 0

Eddie Martinez

Reputation: 13910

You can also count using regular expressions.

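import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static int countWords(String line) {
    // \w+ matches a run of letters, digits or underscores.
    Pattern pattern = Pattern.compile("\\w+");
    Matcher matcher = pattern.matcher(line);

    int count = 0;
    while (matcher.find()) {
        count++;
    }
    return count;
}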
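Note that \w only covers letters, digits and underscores, so an apostrophe or hyphen splits a word in two. If words should simply be "anything between whitespace", \S+ may be a better fit. A sketch comparing the two (the count helper and class name are illustrative, not part of the answer above):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexWordCount {
    // Counts the non-overlapping matches of `regex` in `text`.
    static int count(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        int n = 0;
        while (matcher.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "Don't stop\nnow";
        System.out.println(count("\\w+", text)); // 4: Don, t, stop, now
        System.out.println(count("\\S+", text)); // 3: Don't, stop, now
    }
}
```

Both patterns treat the newline as a separator, so either one avoids the merged-word problem from the question.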

Upvotes: 0

Thiago Gama

Reputation: 150

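/*
 * Counting number of words using regular expression.
 * The input is trimmed first so leading whitespace does not add an empty token.
 */
public int countWord(String word) {
    String trimmed = word.trim();
    return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
}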

Upvotes: 2

You can change the condition that checks for spaces so that it also matches a newline:

if ((line.charAt(i) == ' ' || line.charAt(i) == '\n') && line.charAt(i + 1) != ' ')
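The loop around that condition is not shown in the question, so this is only a guess at its shape: if i runs up to the last index, line.charAt(i + 1) would throw a StringIndexOutOfBoundsException. A sketch of a character-by-character counter that avoids the look-ahead by tracking whether it is currently inside a word (names are hypothetical):

```java
public class CharWordCount {
    // Counts words by detecting each transition from whitespace to non-whitespace.
    static int countWords(String text) {
        int count = 0;
        boolean inWord = false;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isWhitespace(text.charAt(i))) {
                inWord = false;
            } else if (!inWord) {
                // First character of a new word.
                inWord = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWords("Test word\ntest words")); // 4
    }
}
```

Character.isWhitespace covers spaces, tabs and line terminators, so no separate check for '\n' is needed.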

Upvotes: 2
