efnx

Reputation: 73

Regular Expression for x vowels in word of n length

I'm trying to write a regular expression in Java that will match a word of length n that has at least x vowels in it.

So far I've come up with the following:

// match words that are length 10 and have at least 2 vowels in them
(?=\w{10})(?:[^aeiou\W]*[aeiuo]){2}\w+

This seems to work, but it also matches words longer than 10 characters, e.g.:

wildernesses - matches

volatilizations - matches

voiceprint - matches (this should be the only match)

I would like it so that the length=10 constraint is enforced. I suspect that it may have something to do with the fact that I'm adding letters (the vowels) to the length of the string, but I'm not certain. Any help / guidance will be appreciated.

Upvotes: 3

Views: 2109

Answers (3)

jrreid

Reputation: 81

Try this out... (?<=\b|\p{Punct})(?:(?i)(?:aeiou{2,})|(?:a-z&&[^aeiou]{3,}))(?<=\w{10})

I tested this against sample data, and it seems to work. The pattern also accounts for punctuation.

Upvotes: 0

Bohemian

Reputation: 425033

You can simplify this greatly by using a single lookahead (shown here as a Java String):

"(?i)\\b(?=([^aeiou ]*[aeiou]){2,})[a-z]{10}\\b"

Note that all other answers use \w for letters, but \w includes the underscore character, which is not a letter.

(?i) turns on case insensitivity.
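
As a rough illustration of how this pattern could be used, here is a minimal sketch that scans a piece of text and collects every match. The class name VowelWordFinder and the method findWords are hypothetical names chosen for this example; only the regex itself comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class VowelWordFinder {
    // Regex from the answer: a word of exactly 10 letters containing
    // at least 2 vowels, matched case-insensitively.
    static final Pattern TEN_LETTERS_TWO_VOWELS =
        Pattern.compile("(?i)\\b(?=([^aeiou ]*[aeiou]){2,})[a-z]{10}\\b");

    // Returns every substring of text that the pattern matches, in order.
    static List<String> findWords(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = TEN_LETTERS_TWO_VOWELS.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample words taken from the question.
        System.out.println(findWords("wildernesses volatilizations voiceprint"));
    }
}
```

With the sample words from the question, only voiceprint should be collected, because [a-z]{10} followed by \b rejects the longer words.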

Upvotes: 2

Eric

Reputation: 97571

Use word boundaries, \b, to prevent the match happening halfway through a word:

\b(?=\w{10}\b)(?:[^aeiou\W]*[aeiuo]){2,}[^aeiou\W]*\b

Of the three sample words below, this will match only voiceprint:

wildernesses voiceprint volatilizations
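
Since the question asks about a general length n and vowel count x, the same pattern can be assembled from two integers. The following is only a sketch under that assumption: the class VowelWordPatternBuilder and its methods build and findWords are hypothetical names, and the pattern stays case-sensitive, exactly as written above, so upper-case vowels are not counted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class VowelWordPatternBuilder {
    // Builds the word-boundary pattern for a word of exactly `length`
    // word characters containing at least `minVowels` lower-case vowels.
    static Pattern build(int length, int minVowels) {
        if (length < 1 || minVowels < 0) {
            throw new IllegalArgumentException("length must be >= 1 and minVowels >= 0");
        }
        String regex = "\\b(?=\\w{" + length + "}\\b)"              // exactly `length` word chars
                     + "(?:[^aeiou\\W]*[aeiou]){" + minVowels + ",}" // at least `minVowels` vowels
                     + "[^aeiou\\W]*\\b";                            // trailing consonants, then word end
        return Pattern.compile(regex);
    }

    // Returns every word in text that satisfies both constraints.
    static List<String> findWords(String text, int length, int minVowels) {
        List<String> found = new ArrayList<>();
        Matcher m = build(length, minVowels).matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample words taken from the question, with n = 10 and x = 2.
        System.out.println(findWords("wildernesses voiceprint volatilizations", 10, 2));
    }
}
```

The lookahead (?=\w{n}\b) is what pins the length: without the trailing \b it would only assert that at least n word characters follow, which is why the pattern in the question also accepted longer words.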

Upvotes: 3
