Reputation: 49
I am trying to read the words of a file into a stream and then count the number of times the word "the" appears in the file. I cannot seem to figure out an efficient way of doing this using only streams.
Example: If the file contained a sentence such as: "The boy jumped over the river." the output would be 2
This is what I've tried so far:
public static void main(String[] args){
    String filename = "input1";
    try (Stream<String> words = Files.lines(Paths.get(filename))){
        long count = words.filter( w -> w.equalsIgnoreCase("the"))
                          .count();
        System.out.println(count);
    } catch (IOException e){
    }
}
Upvotes: 1
Views: 2199
Reputation: 124235
As the name suggests, Files.lines returns a stream of lines, not words. If you want to iterate over words, you can use a Scanner, like this:
Scanner sc = new Scanner(new File(fileLocation));
while(sc.hasNext()){
    String word = sc.next();
    //handle word
}
If you really want to use streams, you can split each line into words and then flat-map the stream of lines to a stream of those words:
try (Stream<String> lines = Files.lines(Paths.get(filename))){
    long count = lines
        .flatMap(line->Arrays.stream(line.split("\\s+"))) //add this
        .filter( w -> w.equalsIgnoreCase("the"))
        .count();
    System.out.println(count);
} catch (IOException e){
    e.printStackTrace(); //at least print the exception so you know what went wrong
}
BTW, you shouldn't leave catch blocks empty. At least print the exception that was thrown, so you have more information about the problem.
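One thing to watch with splitting on "\\s+" is that punctuation stays attached to the word, so a token like "the," or "the." would not match. Below is a minimal, self-contained sketch that splits on runs of non-word characters ("\\W+") instead. That regex is a swap from the one used above, and the class name WordCount and the helper countWord are hypothetical names chosen for illustration. Note that "\\W" by default treats only ASCII letters, digits and underscore as word characters, so this may not suit text with accented letters.

```java
import java.util.Arrays;
import java.util.stream.Stream;

public class WordCount {
    // Hypothetical helper: counts case-insensitive occurrences of target
    // in a stream of lines, splitting each line on non-word characters.
    static long countWord(Stream<String> lines, String target) {
        return lines
                .flatMap(line -> Arrays.stream(line.split("\\W+"))) // lines -> words
                .filter(word -> word.equalsIgnoreCase(target))
                .count();
    }

    public static void main(String[] args) {
        // In-memory lines stand in for Files.lines(Paths.get(filename))
        Stream<String> lines = Stream.of(
                "The boy jumped over the river.",
                "He saw the, then the.");
        System.out.println(countWord(lines, "the")); // prints 4
    }
}
```

For a real file you would pass Files.lines(Paths.get(filename)) to countWord inside a try-with-resources block, as in the snippet above.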
Upvotes: 1
Reputation: 151
You could use Java's StreamTokenizer for this purpose.
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StreamTokenizer;
import java.nio.charset.StandardCharsets;
public class Main {
    public static void main(String[] args) throws IOException {
        long theWordCount = 0;
        String input = "The boy jumped over the river.";
        // Wrap the sample text in an InputStream
        try (InputStream stream = new ByteArrayInputStream(
                input.getBytes(StandardCharsets.UTF_8))) {
            // Decode with the same charset that was used to encode the bytes
            StreamTokenizer tokenizer = new StreamTokenizer(
                    new InputStreamReader(stream, StandardCharsets.UTF_8));
            int tokenType;
            while ((tokenType = tokenizer.nextToken())
                    != StreamTokenizer.TT_EOF) {
                // Skip numbers and punctuation; only look at word tokens
                if (tokenType == StreamTokenizer.TT_WORD) {
                    String word = tokenizer.sval;
                    if ("the".equalsIgnoreCase(word)) {
                        theWordCount++;
                    }
                }
            }
        }
        System.out.println("The word 'the' count is: " + theWordCount);
    }
}
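To use the same idea on the file from the question rather than an in-memory string, the counting loop could be pulled into a helper that accepts any Reader. This is only a sketch: the names TokenizerCount and countWord are hypothetical, and the IOException is wrapped in an UncheckedIOException purely to keep the call site short. Keep in mind that with its default settings StreamTokenizer treats '/' as a comment character and quote characters as string delimiters, so results on arbitrary text can differ from plain whitespace splitting.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class TokenizerCount {
    // Hypothetical helper: counts case-insensitive occurrences of target
    // among the word tokens produced by a default StreamTokenizer.
    static long countWord(Reader reader, String target) {
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        long count = 0;
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD
                        && target.equalsIgnoreCase(tokenizer.sval)) {
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // A StringReader stands in for a file reader here
        Reader reader = new StringReader("The boy jumped over the river.");
        System.out.println(countWord(reader, "the")); // prints 2
    }
}
```

For a file, a reader obtained from Files.newBufferedReader(Paths.get(filename)) in a try-with-resources block could be passed in instead of the StringReader.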
Upvotes: 0