1547kime

Reputation: 31

Process a file word-by-word using Stream<T>

I'm learning to use Stream<String> and am trying to get all words in a file that contain a vowel and are longer than 4 characters, without using Scanner.hasNext().

example text in a file

For the example file above, I want to write code like this:

Stream<String> text = Files.lines(Paths.get("example.txt"));
List<String> result = text.filter(w -> w.length() > 4)
        .filter(w -> w.contains("a") || w.contains("e") ||
                w.contains("i") || w.contains("o") || w.contains("u"))
        .collect(Collectors.toList());
System.out.println(result);

The output that I want to get is

There bunch vowels example vowel

But the result is just the original lines of the file, not individual words.

All I know is how to read a text file line by line using Stream<String>, but I want to read it word by word (or split each line into words).

How can I do this?

Upvotes: 2

Views: 152

Answers (1)

Prasanna

Reputation: 2488

You can try the below snippet

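// try-with-resources closes the file once the stream has been consumed
try (Stream<String> lines = Files.lines(Paths.get("/tmp/examples.txt"))) {
    List<String> result = lines
            .flatMap(line -> Arrays.stream(line.split("\\W+")))  // split each line into words
            .filter(w -> w.length() > 4)                          // keep words longer than 4 characters
            .filter(w -> w.matches(".*[aeiou].*"))                // keep words containing a lowercase vowel
            .collect(Collectors.toList());

    System.out.println(result);
}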

Regex used to split each line into words: "\\W+" matches one or more non-word characters.

Note:
The problem with this approach is that a word such as foo'sbar will be split into two words, foo and sbar. If you want to exclude ' as a separator, you can use the pattern [\W&&[^']]+ (written as "[\\W&&[^']]+" in a Java string literal). Add any other characters you consider valid inside a word to the expression in the same way.

Output:

[There, bunch, vowels, example, vowel]
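
To illustrate the note about apostrophes, here is a minimal sketch that splits the same line with both patterns. The class name WordSplitDemo, the helper splitWords, and the sample line are made up for illustration and are not part of the original snippet.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class WordSplitDemo {

    // Splits a line on the given regex and drops any empty tokens
    // (split can produce a leading empty string when the line starts with a separator).
    static List<String> splitWords(String line, String regex) {
        return Arrays.stream(line.split(regex))
                .filter(w -> !w.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String line = "foo'sbar has vowels"; // hypothetical sample input

        // Every non-word character, including the apostrophe, is a separator.
        System.out.println(splitWords(line, "\\W+"));

        // Non-word characters except the apostrophe are separators.
        System.out.println(splitWords(line, "[\\W&&[^']]+"));
    }
}
```

With the first pattern foo'sbar is broken at the apostrophe; with the second it stays in one piece, so the length and vowel filters see the whole word.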

Upvotes: 3
