james Chol

Reputation: 53

Count number of sentences in a text file

By a sentence I mean, roughly, a string that ends in !, ? or .

Except for things like Dr. and Mr. I know you cannot reliably detect a sentence in Java, because that depends on grammar.

So what I really mean is a period, exclamation mark or question mark that is followed by a capital letter.

How would one do this?

This is what I have, but it's not working:

        BufferedReader Compton = new BufferedReader(new FileReader(fileName));
        int sentenceCount=0;

        String violet;

        String limit="?!.";
        while(Compton.ready())
        {
            violet=Compton.readLine();

            for(int i=0; i<violet.length()-1;i++)
            {
                if(limit.indexOf(violet.charAt(i)) != -1 && i>0 && limit.indexOf(violet.charAt(i-1)) != -1)
                {
                    sentenceCount++;
                }
            }
        }
        System.out.println("the amount of sentence is " + sentenceCount);

EDIT: A new way that works better:

        String violet;
        while(Compton.ready())
        {
            violet=Compton.readLine();
            sentenceCount=violet.split("[!?.:]+").length;
            System.out.println("the number of words in line is " + sentenceCount);
        }

Upvotes: 1

Views: 8789

Answers (3)

sprinter

Reputation: 27946

One way to do this is to scan the file word by word and count the words that end in one of your punctuation marks and are not in your exception list.

Here's a possible implementation using Java 8 streams:

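import java.io.File;
import java.util.*;
import java.util.stream.StreamSupport;

List<String> exceptions = Arrays.asList("Dr.", "Mr.");
Scanner scanner = new Scanner(new File(filename)); // scan the file itself, not the name string
Iterable<String> words = () -> scanner;            // Scanner is an Iterator<String> of whitespace-separated tokens
long sentenceCount = StreamSupport.stream(words.spliterator(), false)
    .filter(word -> word.matches(".*[.?!]"))       // token ends in sentence punctuation
    .filter(word -> !exceptions.contains(word))    // skip known abbreviations
    .count();                                      // count() returns a long
scanner.close();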
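
To try the idea without a file, here is a minimal self-contained sketch of the same word-based approach. The class name `WordSentenceCounter`, the method name `countSentences` and the longer abbreviation list are my own illustrative choices, and it scans a string instead of a file so it can be called directly.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
import java.util.stream.StreamSupport;

final class WordSentenceCounter {
    // Abbreviations that end in a period but do not end a sentence (illustrative list).
    private static final List<String> EXCEPTIONS = Arrays.asList("Dr.", "Mr.", "Mrs.", "Ms.");

    // Counts whitespace-separated tokens that end in '.', '?' or '!'
    // and are not one of the listed abbreviations.
    static long countSentences(String text) {
        try (Scanner scanner = new Scanner(text)) {
            Iterable<String> words = () -> scanner; // Scanner implements Iterator<String>
            return StreamSupport.stream(words.spliterator(), false)
                .filter(word -> word.matches(".*[.?!]"))
                .filter(word -> !EXCEPTIONS.contains(word))
                .count();
        }
    }
}
```

For example, `WordSentenceCounter.countSentences("Dr. Smith arrived. Was he late? No!")` skips the token "Dr." and counts the three tokens "arrived.", "late?" and "No!".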

Upvotes: 0

Lev Kuznetsov

Reputation: 3728

One liner:

int n = new String(Files.readAllBytes(Paths.get(path))).split("[\\.\\?!]").length;

This uses Java 7 constructs to read the whole file into a byte array, create a string from it, split that string into an array of sentences, and take the length of the array.
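
One thing to watch with a plain split: `String.split` only drops trailing empty strings, so a file that ends in a newline leaves a whitespace-only last piece, and a run like "..." produces empty pieces in the middle. A possible variation, sketched under the assumption that the file is UTF-8, splits on runs of punctuation and counts only the non-blank pieces. The names `SplitSentenceCounter`, `countSentences` and `countSentencesInFile` are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

final class SplitSentenceCounter {
    // Splits on runs of '.', '?' or '!' and counts the pieces that
    // contain something other than whitespace.
    static long countSentences(String text) {
        return Arrays.stream(text.split("[.?!]+"))
            .filter(piece -> !piece.trim().isEmpty())
            .count();
    }

    // Reads the whole file as UTF-8 (an assumption about the encoding)
    // and counts its sentences.
    static long countSentencesInFile(String path) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(path));
        return countSentences(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

A trailing fragment with no closing punctuation still counts as a sentence here, and abbreviations such as "Dr." are not handled.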

Upvotes: 1

javac

Reputation: 2441

BufferedReader reader = new BufferedReader(new FileReader(fileName));
int sentenceCount = 0;
String line;
String delimiters = "?!.";

while ((line = reader.readLine()) != null) { // Continue reading until end of file is reached
    for (int i = 0; i < line.length(); i++) {
        if (delimiters.indexOf(line.charAt(i)) != -1) { // If the delimiters string contains the character
            sentenceCount++;
        }
    }
}

reader.close();
System.out.println("The number of sentences is " + sentenceCount);
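
The loop above counts every delimiter character, so "Dr.", "3.5" and each dot of "..." all add to the total. If you want the rule from the question (punctuation followed by a capital letter, with exceptions like Dr. and Mr.), a regex-based sketch could look like the following. The class name `HeuristicSentenceCounter`, the pattern and the abbreviation list are my own assumptions, not a complete sentence detector, and the method takes the whole text as one string.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HeuristicSentenceCounter {
    // Illustrative abbreviation list (without the trailing period); extend as needed.
    private static final Set<String> ABBREVIATIONS =
        new HashSet<>(Arrays.asList("Dr", "Mr", "Mrs", "Ms"));

    // Group 1: the letters right before the punctuation (may be empty).
    // Group 2: a run of terminators.
    // Lookahead: end of text, or whitespace followed by an uppercase letter.
    private static final Pattern SENTENCE_END =
        Pattern.compile("(\\p{L}*)([.?!]+)(?=\\s*$|\\s+\\p{Lu})");

    static int countSentences(String text) {
        int count = 0;
        Matcher m = SENTENCE_END.matcher(text);
        while (m.find()) {
            // "Dr." followed by a capitalized name is an abbreviation, not a sentence end.
            boolean abbreviation = m.group(2).equals(".") && ABBREVIATIONS.contains(m.group(1));
            if (!abbreviation) {
                count++;
            }
        }
        return count;
    }
}
```

With this sketch, "Dr. Smith arrived. Was he late? No! It cost 3.5 dollars." counts as four sentences. It will still miscount abbreviations that are not in the list, such as "U.S." followed by a capitalized word.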

Upvotes: 3
