Reputation: 19
I'm searching large log files for specific words. I've found some basic solutions for the case where the words in the String are separated by whitespace. But what I need is to find all occurrences of a specific word that can be surrounded by any characters.
E.g. when looking for "hello", "abchello" should return 1 and "##hello123...@456hello8" should return 2.
I could do that with basic for loops, but I want to use mostly streams (and perhaps parallel streams) for this because of the speed gain when going through large files.
The following seems to find "hello" anywhere in a line, but it stops at the first occurrence and moves on to the next line, so each line is counted at most once:
bufferReader = Files.newBufferedReader(Paths.get(file));
Long count = bufferReader != null ? bufferReader.lines().filter(l -> l.matches(".*hello.*")).count() : null;
Upvotes: 0
Views: 239
Reputation: 46
Using org.apache.commons.lang3.StringUtils#countMatches:
int count;
// Files.newBufferedReader never returns null; try-with-resources closes the reader.
try (BufferedReader bufferReader = Files.newBufferedReader(Paths.get(file))) {
    count = bufferReader.lines().mapToInt(line -> StringUtils.countMatches(line, "hello")).sum();
}
More ways to count matches: Occurrences of substring in a string
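If you would rather not add Commons Lang as a dependency, here is a minimal sketch that uses only the JDK. It assumes Java 9 or later (for Matcher.results()), and the class and method names (OccurrenceCounter, countInLine, countInLines, countInFile) are made up for illustration. Like countMatches, it counts non-overlapping occurrences:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper class; the names are illustrative only.
final class OccurrenceCounter {

    // Counts non-overlapping occurrences of a literal word in a single line.
    static long countInLine(String line, String word) {
        // Pattern.quote makes the word a literal, so regex metacharacters in it are harmless.
        return Pattern.compile(Pattern.quote(word)).matcher(line).results().count();
    }

    // Sums the per-line counts over a stream of lines.
    static long countInLines(Stream<String> lines, String word) {
        // Compile once and reuse; Pattern is thread-safe, each line gets its own Matcher.
        Pattern pattern = Pattern.compile(Pattern.quote(word));
        return lines.mapToLong(line -> pattern.matcher(line).results().count()).sum();
    }

    // Reads the file lazily and closes it when done.
    static long countInFile(Path file, String word) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return countInLines(lines, word);
        }
    }

    public static void main(String[] args) {
        System.out.println(countInLine("abchello", "hello"));                // prints 1
        System.out.println(countInLine("##hello123...@456hello8", "hello")); // prints 2
    }
}
```

Since the pattern is shared and a new Matcher is created per line, you could call lines.parallel() before mapToLong. Whether that is actually faster depends on your files and hardware, so measure it first.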
Upvotes: 2