Lofi Peng

Reputation: 19

Calculate number of word occurrences in Stream<String> with characters in front or behind

I'm searching large log files for specific words. I've found some basic solutions that work when the words in the String are separated by whitespace. What I need is to find all occurrences of a specific word, even when it is directly surrounded by other characters.

For example, when looking for "hello", "abchello" should return 1 and "##hello123...@456hello8" should return 2.

I could do that with basic for loops, but I want to use mostly streams (and perhaps parallel streams) for this because of the speed gain when going through large files.

The following matches any line that contains "hello", but it counts each line only once, even when the word appears several times in that line:

bufferReader = Files.newBufferedReader(Paths.get(file));
Long count = bufferReader != null ? bufferReader.lines().filter(l -> l.matches(".*hello.*")).count() : null;

Upvotes: 0

Views: 239

Answers (1)

FogZ

Reputation: 46

Using org.apache.commons.lang3.StringUtils#countMatches:

// Requires Apache Commons Lang 3 on the classpath (import org.apache.commons.lang3.StringUtils).
try (BufferedReader bufferReader = Files.newBufferedReader(Paths.get(file))) {
    int count = bufferReader.lines().mapToInt(line -> StringUtils.countMatches(line, "hello")).sum();
}

More ways to count matches: Occurrences of substring in a string
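
If adding Commons Lang is not an option, the same per-line counting can probably be done with the JDK alone. The following is a minimal sketch, assuming Java 9 or later (for Matcher.results()) and a log file readable as UTF-8 (the default for Files.lines). The class name WordCount and the method names are made up for illustration. Pattern.quote treats the search word as literal text, so characters such as "." are not interpreted as regex syntax.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper class, not part of any library.
class WordCount {

    // Counts non-overlapping occurrences of the pattern within one line.
    public static long countInLine(String line, Pattern word) {
        return word.matcher(line).results().count();
    }

    // Sums the per-line counts over a stream of lines.
    public static long countInLines(Stream<String> lines, String word) {
        // Quote the word so it is matched literally, not as a regex.
        Pattern pattern = Pattern.compile(Pattern.quote(word));
        return lines.mapToLong(line -> countInLine(line, pattern)).sum();
    }

    // Streams the file lazily and closes it when done.
    public static long countInFile(Path file, String word) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return countInLines(lines, word);
        }
    }

    public static void main(String[] args) {
        Stream<String> sample = Stream.of("abchello", "##hello123...@456hello8");
        System.out.println(countInLines(sample, "hello")); // prints 3
    }
}
```

A compiled Pattern is safe to share between threads and a new Matcher is created for every line, so passing lines.parallel() to countInLines should also work. Whether that is actually faster depends on the file and the machine, so it is worth measuring first.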

Upvotes: 2
