Lyu JH

Reputation: 131

Match repeated 3 or more times

This is a quiz exercise.

I'd like to know if a text contains words with 4 characters or more which are repeated 3 or more times in the text (anywhere in the text). If so, set one (and only one) backreference for each word.

I tried this pattern:

(?=\b(\w{4,}+)\b.*\1)

The result was:

Test 10/39: Not working, sorry. Read the task description again. It matches notword word word

Then I tried:

(?=(\b\w{4,}\b)(?:.*\b\1\b){2,})

Test 22/39: If a certain word is repeated many times, you're setting more than 1 backreference (common mistake, I know). You don't necessarily need to match the first occurrence of the word. Can you avoid a match in >word< word word word, and match word >word< word word? (Hint: match if it's followed by 2 occurences, don't match if it's followed by 3)

Regex demo

Upvotes: 2

Views: 2167

Answers (1)

Nick

Reputation: 147146

If I understand your question correctly, this should do what you want:

(?=(\b\w{4,}\b)(?:.*\b\1\b){2})(?!(\b\w{4,}\b)(?:.*\b\1\b){3})

It is essentially the same as your regex: it looks for a word of 4 or more characters that is followed by 2 more occurrences of itself, so the word appears at least 3 times. The matching word is captured in group 1. The negative lookahead then rejects any position where the word is followed by 3 or more occurrences, so a word that occurs 4 or more times matches only once, at its third-from-last occurrence.
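To see the difference the negative lookahead makes, here is a minimal sketch in Python, assuming the quiz's regex flavor behaves like Python's `re` module for these constructs. The helper name `captured_words` and the sample strings are made up for illustration and are not part of the quiz.

```python
import re

# Second attempt from the question: matches at every occurrence
# that is followed by 2 or more further occurrences.
ATTEMPT = r"(?=(\b\w{4,}\b)(?:.*\b\1\b){2,})"

# Pattern from this answer: same positive lookahead with exactly 2 repeats,
# plus a negative lookahead rejecting positions followed by 3 occurrences.
ANSWER = r"(?=(\b\w{4,}\b)(?:.*\b\1\b){2})(?!(\b\w{4,}\b)(?:.*\b\1\b){3})"


def captured_words(pattern, text):
    """Return the group 1 capture of every (zero-width) match, in order."""
    return [m.group(1) for m in re.finditer(pattern, text)]


if __name__ == "__main__":
    sample = "word word word word"
    print(captured_words(ATTEMPT, sample))  # the word is captured twice
    print(captured_words(ANSWER, sample))   # the word is captured once
    print(captured_words(ANSWER, "notword word word"))  # no capture
```

Note that `.` does not cross line breaks by default in Python, so for multi-line input you would probably need to compile the pattern with `re.DOTALL`; whether the quiz tests multi-line text is not stated in the question.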

Demo on regex101

Upvotes: 8
