Abhijit Bashetti

Reputation: 8658

Regular expression for finding duplicate words in Java

I want to find the words that are repeated in a given String, using a regular expression that finds every occurrence of a repeated word. For example: "I want to eat apple. apple is a fruit".

The regular expression should find the word "apple".

Upvotes: 2

Views: 4778

Answers (3)

karthik manchala

Reputation: 13640

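You can use the following to match every word that occurs again later in the same line (the backslashes are doubled because the pattern is written as a Java string literal).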

(\\b\\w+\\b)(?=.*\\b\\1\\b)        // matches duplicates only in a single line

Edit: If you want to match duplicates across multiple lines, you can use:

(\\b\\w+\\b)(?=[\\s\\S]*\\b\\1\\b)  // or the above regex with DOTALL flag

See demo for single line and demo for multiple lines
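
To make the difference between the two variants concrete, here is a minimal sketch that runs the first pattern with and without the DOTALL flag. The class name `DuplicateLookahead`, the helper `repeatedWords`, and the sample text are assumptions made for illustration; they are not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex itself comes from the answer.
class DuplicateLookahead {

    // Collects every word that the lookahead finds again later in the text.
    static List<String> repeatedWords(String text, int flags) {
        Pattern p = Pattern.compile("(\\b\\w+\\b)(?=.*\\b\\1\\b)", flags);
        Matcher m = p.matcher(text);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // The repeated word sits on two different lines.
        String text = "I want to eat apple.\napple is a fruit";

        // Without DOTALL, '.' does not cross the line break: prints []
        System.out.println(repeatedWords(text, 0));

        // With DOTALL, '.' also matches the line break: prints [apple]
        System.out.println(repeatedWords(text, Pattern.DOTALL));
    }
}
```

Note that the match is case-sensitive, so "I" and "is" are not treated as repeats of each other.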

Upvotes: 1

Daniel Sperry

Reputation: 4491

This works for multiple repetitions and across multiple lines:

    Pattern p = Pattern.compile("\\b(\\w+)\\b(?=.*\\b(\\1)\\b)", Pattern.DOTALL);

    String s = "I want to eat apple. apple is a fruit.\r\n I really want fruit.";
    Matcher m = p.matcher(s);
    while (m.find()) {
        System.out.println("at: " + m.start(1) + " " + m.group(1));
        System.out.println("    " + m.start(2) + " " + m.group(2));
    }

It outputs:

at: 0 I
    41 I
at: 2 want
    50 want
at: 14 apple
    21 apple
at: 32 fruit
    55 fruit
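
Because the lookahead matches every occurrence that has a later repeat, a word that appears three times is reported twice. If only the distinct repeated words are wanted, one possible approach is to collect the matches into a set. The sketch below assumes that goal; the class name `DistinctDuplicates` and the method `duplicateWords` are hypothetical names introduced here.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper built around the same lookahead pattern.
class DistinctDuplicates {

    private static final Pattern REPEATED =
            Pattern.compile("\\b(\\w+)\\b(?=.*\\b\\1\\b)", Pattern.DOTALL);

    // Returns each repeated word once, in order of first appearance.
    static Set<String> duplicateWords(String text) {
        Set<String> words = new LinkedHashSet<>();
        Matcher m = REPEATED.matcher(text);
        while (m.find()) {
            words.add(m.group(1)); // the set silently drops repeated matches
        }
        return words;
    }

    public static void main(String[] args) {
        String s = "I want to eat apple. apple is a fruit.\r\n I really want fruit.";
        System.out.println(duplicateWords(s)); // prints [I, want, apple, fruit]
    }
}
```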

Upvotes: 1

Steve Chaloner

Reputation: 8202

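This approach splits the string on every run of characters that are not letters or digits (so punctuation and whitespace both act as separators) and groups the resulting words into a Map. Any word whose list has more than one entry is a duplicate.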

Map<String, List<String>> groups =
Stream.of("I? want.... to eat apple    eat apple.      apple, is! a fruit".split("[^\\p{L}\\p{N}]+"))
      .collect(Collectors.groupingBy(s -> s));

Result:

a=[a], apple=[apple, apple, apple], fruit=[fruit], want=[want], eat=[eat, eat], I=[I], is=[is], to=[to]
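
The grouped map still contains the words that occur only once. As a rough sketch of how the result could be narrowed down to actual duplicates, the variant below counts occurrences and then drops every word seen fewer than twice. The class name `DuplicateCounts`, the method `duplicateCounts`, and the choice of a `TreeMap` (used here only to get a predictable, sorted order) are assumptions, not part of the answer.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical variant of the grouping approach that keeps only duplicates.
class DuplicateCounts {

    static Map<String, Long> duplicateCounts(String text) {
        Map<String, Long> counts = Stream.of(text.split("[^\\p{L}\\p{N}]+"))
                // split() yields a leading "" when the text starts with a separator
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
        // Keep only the words that occur at least twice.
        counts.values().removeIf(n -> n < 2);
        return counts;
    }

    public static void main(String[] args) {
        String s = "I? want.... to eat apple    eat apple.      apple, is! a fruit";
        System.out.println(duplicateCounts(s)); // prints {apple=3, eat=2}
    }
}
```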

Upvotes: 1
