Reputation: 8658
I want to find the repeated words in a given String.
I want a regular expression that finds every word occurring more than once.
For example, given "I want to eat apple. apple is a fruit", the regular expression should find the word "apple".
Upvotes: 2
Views: 4778
Reputation: 13640
You can use the following to match all the duplicate words in a line.
(\\b\\w+\\b)(?=.*\\b\\1\\b) // matches duplicates only in a single line
Edit: If you want to match duplicates in multiple lines you can use:
(\\b\\w+\\b)(?=[\\s\\S]*\\b\\1\\b) // or the above regex with DOTALL flag
See demo for single line and demo for multiple lines
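As a rough illustration of how the two patterns differ, here is a minimal sketch that runs both against a two-line input. The class name `DuplicateWordDemo`, the helper `matches`, and the sample text are made up for this example; only the two regular expressions come from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DuplicateWordDemo {
    // Single-line variant: '.' does not match line terminators by default,
    // so the lookahead only sees the rest of the current line.
    static final Pattern SAME_LINE =
            Pattern.compile("(\\b\\w+\\b)(?=.*\\b\\1\\b)");

    // Multi-line variant: [\s\S] matches any character, including line breaks.
    static final Pattern ANY_LINE =
            Pattern.compile("(\\b\\w+\\b)(?=[\\s\\S]*\\b\\1\\b)");

    // Hypothetical helper: returns every word that the pattern matches,
    // i.e. every word that occurs again somewhere after that position.
    static List<String> matches(Pattern p, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "I want to eat apple. apple is a fruit.\nI like fruit.";
        System.out.println(matches(SAME_LINE, text)); // prints [apple]
        System.out.println(matches(ANY_LINE, text));  // prints [I, apple, fruit]
    }
}
```

Note that the match is case-sensitive, so "I" and "is" are not confused, and the `\b` anchors keep "a" from matching inside "eat".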
Upvotes: 1
Reputation: 4491
This works for multiple repetitions and multiline:
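import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Group 1 is the current word; group 2 is its last later occurrence,
// because the .* inside the lookahead is greedy.
Pattern p = Pattern.compile("\\b(\\w+)\\b(?=.*\\b(\\1)\\b)", Pattern.DOTALL);
String s = "I want to eat apple. apple is a fruit.\r\n I really want fruit.";
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println("at: " + m.start(1) + " " + m.group(1));
    System.out.println(" " + m.start(2) + " " + m.group(2));
}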
It outputs:
at: 0 I
 41 I
at: 2 want
 50 want
at: 14 apple
 21 apple
at: 32 fruit
 55 fruit
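The loop above reports a word once for every occurrence that is followed by another one, so a word appearing three times would be printed twice. If only the distinct repeated words are wanted, one option is to collect group 1 into a set. This is a sketch under that assumption; the class name `RepeatedWords` and the method `repeatedWords` are illustrative, not part of the answer.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedWords {
    // Same idea as the pattern above, without the second capturing group.
    private static final Pattern REPEATED =
            Pattern.compile("\\b(\\w+)\\b(?=.*\\b\\1\\b)", Pattern.DOTALL);

    // Returns each repeated word once, in order of first appearance.
    static Set<String> repeatedWords(String text) {
        Set<String> words = new LinkedHashSet<>();
        Matcher m = REPEATED.matcher(text);
        while (m.find()) {
            words.add(m.group(1)); // the set silently ignores duplicates
        }
        return words;
    }

    public static void main(String[] args) {
        String s = "I want to eat apple. apple is a fruit.\r\n I really want fruit.";
        System.out.println(repeatedWords(s)); // prints [I, want, apple, fruit]
    }
}
```

A `LinkedHashSet` is used here only to keep the output order predictable; a plain `HashSet` would work if order does not matter.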
Upvotes: 1
Reputation: 8202
This approach splits the string on every run of characters that are neither letters nor digits and groups the resulting words into a Map.
Any word whose list holds more than one entry is a repeated word.
Map<String, List<String>> groups =
    Stream.of("I? want.... to eat apple eat apple. apple, is! a fruit".split("[^\\p{L}\\p{N}]+"))
          .collect(Collectors.groupingBy(s -> s));
Result:
a=[a], apple=[apple, apple, apple], fruit=[fruit], want=[want], eat=[eat, eat], I=[I], is=[is], to=[to]
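The grouped map still contains words that occur only once. To keep just the repeated ones, the grouping can be combined with a count and a filter, roughly as sketched below. The class name `WordCounts`, the method `repeatedWordCounts`, and the choice of a `TreeMap` (for stable, sorted output) are assumptions of this example rather than part of the answer.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordCounts {
    // Counts each word, then drops every word that occurs fewer than twice.
    static Map<String, Long> repeatedWordCounts(String text) {
        Map<String, Long> counts = Arrays.stream(text.split("[^\\p{L}\\p{N}]+"))
                .filter(w -> !w.isEmpty()) // split yields "" if text starts with a separator
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
        counts.values().removeIf(n -> n < 2);
        return counts;
    }

    public static void main(String[] args) {
        String text = "I? want.... to eat apple eat apple. apple, is! a fruit";
        System.out.println(repeatedWordCounts(text)); // prints {apple=3, eat=2}
    }
}
```

Unlike the regex answers, this version also reports how often each repeated word occurs.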
Upvotes: 1