Reputation: 445
I am trying to split a paragraph of text into separate sentences based on punctuation marks, i.e. [.?!]. However, the scanner also seems to split at the end of each line, even though I've specified a particular delimiter pattern. How do I resolve this? Thanks!
this is a text file. yes the
deliminator works
no it does not. why not?
Scanner scanner = new Scanner(fileInputStream);
scanner.useDelimiter("[.?!]");
while (scanner.hasNext()) {
    String line = scanner.next();
    System.out.println(line);
}
Upvotes: 1
Views: 2097
Reputation: 17369
I don't believe the scanner is splitting on line breaks. Your "line" variables simply contain the line breaks from the file, and println prints them as-is, which is why you get that output. For example, you can replace those line breaks with spaces:
(I am reading the same input text you supplied from a file, so it has some extra file reading code, but you'll get the picture.)
try {
    File file = new File("assets/test.txt");
    Scanner scanner = new Scanner(file);
    // split on sentence-ending punctuation only
    scanner.useDelimiter("[.?!]");
    while (scanner.hasNext()) {
        String sentence = scanner.next();
        // line breaks are not delimiters, so they stay inside the token
        sentence = sentence.replaceAll("\\r?\\n", " ");
        // uncomment for nicer output
        //sentence = sentence.trim();
        System.out.println(sentence);
    }
    scanner.close();
} catch (FileNotFoundException e) {
    e.printStackTrace();
}
This is the result:
this is a text file
 yes the deliminator works no it does not
 why not
And if I uncomment the trim line, it's a bit nicer:
this is a text file
yes the deliminator works no it does not
why not
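For anyone who wants to try this without a file, here is a minimal self-contained sketch of the same idea. The class name SentenceSplitter and the helper splitSentences are made up for illustration, and the input is a hard-coded string rather than a file. It also collapses every run of whitespace with "\\s+" (a slightly broader pattern than the "\\r?\\n" used above) and skips empty tokens, which is my own addition:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class SentenceSplitter {

    // Hypothetical helper: splits text on '.', '?' or '!' and turns any
    // whitespace inside a sentence (including line breaks) into one space.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useDelimiter("[.?!]");
            while (scanner.hasNext()) {
                // The token still contains the original line breaks here.
                String sentence = scanner.next().replaceAll("\\s+", " ").trim();
                // Skip tokens that were only whitespace, e.g. a trailing newline.
                if (!sentence.isEmpty()) {
                    sentences.add(sentence);
                }
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "this is a text file. yes the\ndeliminator works\nno it does not. why not?\n";
        for (String sentence : splitSentences(text)) {
            System.out.println(sentence);
        }
    }
}
```

Running main prints the same three sentences as the trimmed output above. The delimiter is unchanged from the question; only the handling of each token differs.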
Upvotes: 5