Zack

Reputation: 671

Tracking while reading a text file in Java

I'm trying to build a program with BufferedReader that reads a file, keeps track of the number of vowels and words, and calculates the average number of words per line. I have the skeleton in place to read the file, but I really don't know where to take it from here. Any help would be appreciated. Thanks.

import java.io.*;

public class JavaReader
{
    public static void main(String[] args) throws IOException
    {
        String line;
        BufferedReader in;

        in = new BufferedReader(new FileReader("message.txt"));

        line = in.readLine();

        while(line != null)
        {
            System.out.println(line);
            line = in.readLine();
        }


    }

}

Upvotes: 2

Views: 2165

Answers (2)

Michael Yaworski

Reputation: 13483

Here's what I got. The word counting is questionable, but works for an example that I will give. Changes can be made (I accept criticism).

import java.io.*;

public class JavaReader
{
    public static void main(String[] args) throws IOException
    {
        // for keeping track of the file content
        StringBuilder fileText = new StringBuilder();

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new FileReader("message.txt"))) {
            String line;
            while ((line = in.readLine()) != null) {
                fileText.append(line).append('\n');
            }
        }

        // put file content to a string, display it for a test
        String fileContent = fileText.toString();
        System.out.println(fileContent + "--------------------------------");

        int vowelCount = 0, lineCount = 0;

        // for every char in the file
        for (char ch : fileContent.toCharArray())
        {
            // if this char is a vowel
            if ("aeiouAEIOU".indexOf(ch) > -1) {
                vowelCount++;
            }
            // if this char is a new line
            if (ch == '\n') {
                lineCount++;
            }
        }
        int wordCount = checkWordCount(fileContent);
        // cast so the division is not truncated to an integer
        double avgWordCountPerLine = (double) wordCount / lineCount;

        System.out.println("Vowel count: " + vowelCount);
        System.out.println("Line count: " + lineCount);
        System.out.println("Word count: " + wordCount);
        System.out.print("Average word count per line: " + avgWordCountPerLine);
    }

    public static int checkWordCount(String fileContent) {

        // split words by punctuation and whitespace
        String[] words = fileContent.split("[\\n .,;:&?]"); // array of words
        String punctuations = ".,:;";
        int wordCount = 0;

        // for every word in the word array
        for (String word : words) {

            // only check if it's a word if the word isn't whitespace
            if (!word.trim().isEmpty()) {
                // reset for every word, so one punctuation token can't block all later words
                boolean isPunctuation = false;

                // for every punctuation
                for (char punctuation : punctuations.toCharArray()) {

                    // if the trimmed word is just a punctuation
                    if (word.trim().equals(String.valueOf(punctuation)))
                    {
                        isPunctuation = true;
                    }
                }

                // only add one to wordCount if the word wasn't punctuation
                if (!isPunctuation) {
                    wordCount++;
                }
            }
        }
        return wordCount;
    }
}

Sample input/output:

File:

This is a test. How do you do?


This is still a test.Let's go,,count.

Output:

This is a test. How do you do?


This is still a test.Let's go,,count.
--------------------------------
Vowel count: 18
Line count: 4
Word count: 16
Average word count per line: 4.0
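
Since the word counting above is described as questionable, here is one hedged alternative for comparison. It is only a sketch: the class name `WordCountSketch` and the rule "a word is a run of letters, optionally joined by apostrophes" are assumptions made for illustration, not part of the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordCountSketch {
    // Assumption: a word is a run of ASCII letters, optionally joined by
    // apostrophes, so "Let's" counts as one word.
    private static final Pattern WORD = Pattern.compile("[A-Za-z]+(?:'[A-Za-z]+)*");

    // counts every non-overlapping match of WORD in the text
    static int countWords(String text) {
        Matcher matcher = WORD.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // same content as the sample file above
        String sample = "This is a test. How do you do?\n\n\nThis is still a test.Let's go,,count.\n";
        System.out.println("Word count: " + countWords(sample)); // prints "Word count: 16"
    }
}
```

Matching words directly, rather than splitting on a fixed list of separators, means any other punctuation (dashes, quotes, brackets) is skipped without having to be listed. Digits and non-ASCII letters are not counted under this assumption.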

Upvotes: 1

Nic Robertson

Reputation: 1208

You can use a Scanner to pass over the line and retrieve every token of the string line.

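line = line.replaceAll("[^a-zA-Z\\s]", ""); //remove all punctuation, but keep the whitespace between words
line = line.toLowerCase();                  //make line lower case
Scanner scan = new Scanner(line);
String word = scan.next();                  //next token; call this inside while(scan.hasNext()) to visit every word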

Then you could loop through each token to calculate the vowels in each word.

for(int i = 0; i < word.length(); i++){
    //get char
    char c = word.charAt(i);
    //check if the char is a vowel here
    if("aeiou".indexOf(c) > -1){
        //c is vowel
    }
}   

All you need to do is set a couple of counter ints to keep track of these and you're laughing.
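
As a rough sketch of what those counters could look like, the class below keeps three ints and updates them one line at a time. The names `ScannerCounters`, `addLine` and `averageWordsPerLine` are made up for this example, and the cleaning regex assumes that only letters and whitespace should survive.

```java
import java.util.Scanner;

class ScannerCounters {
    int lines = 0;
    int words = 0;
    int vowels = 0;

    // updates the counters for one line, e.g. each line returned by in.readLine()
    void addLine(String line) {
        lines++;
        // Assumption: drop everything except letters and whitespace, then lower-case
        String cleaned = line.replaceAll("[^a-zA-Z\\s]", "").toLowerCase();
        Scanner scan = new Scanner(cleaned);
        while (scan.hasNext()) {
            String word = scan.next(); // next whitespace-separated token
            words++;
            for (int i = 0; i < word.length(); i++) {
                if ("aeiou".indexOf(word.charAt(i)) > -1) {
                    vowels++;
                }
            }
        }
        scan.close();
    }

    // returns 0.0 when no lines were added, to avoid dividing by zero
    double averageWordsPerLine() {
        return lines == 0 ? 0.0 : (double) words / lines;
    }

    public static void main(String[] args) {
        ScannerCounters counters = new ScannerCounters();
        // these two strings stand in for lines read from the file
        counters.addLine("This is a test.");
        counters.addLine("How do you do?");
        System.out.println("Words: " + counters.words
                + ", vowels: " + counters.vowels
                + ", avg words per line: " + counters.averageWordsPerLine());
        // prints "Words: 8, vowels: 9, avg words per line: 4.0"
    }
}
```

One caveat of stripping punctuation this way: words separated only by punctuation, such as "test.Let's", end up merged into a single token, so the result depends on the text having spaces after punctuation.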

Ahh, if you want to make sure that no non-words such as " - " count as a word, the easiest way would probably be to strip everything except letters and whitespace out of the text. I also added it above.

 line = line.replaceAll("[^a-zA-Z\\s]", "");
 line = line.toLowerCase();

Oh, and since you are new to Java, don't forget the import:

 import java.util.Scanner;

Upvotes: 1
