Leander

Reputation: 21

How to find consonant clusters with regex?

I want to find consonant clusters with regex. An example of a cluster is mpl in examples.

To start, I filtered out all the vowels and replaced them with spaces. With vowels filtered out, examples is x mpl s.

How can I filter out the x and the s too?

Upvotes: 2

Views: 3501

Answers (3)

J0e3gan

Reputation: 8938

Since your working definition of "consonant cluster" is two or more consonants in succession, you can simply use the following pattern (case-insensitively if you want to handle capital consonants):

[bcdfghjklmnpqrstvwxyz]{2,}
  • [bcdfghjklmnpqrstvwxyz] – a simple whitelist character class for consonants (i.e. that will only match a consonant)
  • {2,} – two or more in succession

You can test the pattern against a couple of input strings in a related regex fiddle.
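
For a quick local check, here is a minimal sketch in Python (the question does not say which language or tool is in use, so Python's `re` module is an assumption, and `find_clusters` is a hypothetical helper name):

```python
import re

# Two or more consonants in a row; IGNORECASE also covers capital consonants.
CLUSTER = re.compile(r"[bcdfghjklmnpqrstvwxyz]{2,}", re.IGNORECASE)

def find_clusters(text):
    """Return every run of two or more consecutive consonants in text."""
    return CLUSTER.findall(text)

print(find_clusters("examples"))  # ['mpl']
print(find_clusters("Strength"))  # ['Str', 'ngth']
```

The isolated x and s in examples never match, because `{2,}` requires at least two consonants in succession.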

Note that since vowels are "a, e, i, o, u, and sometimes y", I have included y in the whitelist character class for consonants above.

You could drop y and use...

[bcdfghjklmnpqrstvwxz]{2,}

...if you want to unconditionally treat y as a vowel rather than a consonant; but the rules for when y is a consonant are a bit more complicated than a simple regex will handle (basically requiring that you identify syllables first, then y's location within them).
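
To see how much the choice about y changes the result, the two patterns can be compared side by side. This is only a sketch, again assuming Python's `re` module; the names `WITH_Y` and `WITHOUT_Y` and the sample words are made up for illustration:

```python
import re

# y counted as a consonant
WITH_Y = re.compile(r"[bcdfghjklmnpqrstvwxyz]{2,}", re.IGNORECASE)
# y unconditionally treated as a vowel
WITHOUT_Y = re.compile(r"[bcdfghjklmnpqrstvwxz]{2,}", re.IGNORECASE)

for word in ("rhythm", "system", "yellow"):
    print(word, WITH_Y.findall(word), WITHOUT_Y.findall(word))

# rhythm ['rhythm'] ['rh', 'thm']
# system ['syst'] ['st']
# yellow ['ll'] ['ll']
```

Neither variant is right for every word, which is the limitation described above.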

Upvotes: 1

Abecee

Reputation: 2393

Turning a comment into an answer…

Since you have already replaced the vowels with whitespace, search for \b.\b (or \b\w\b, which targets only word characters) and replace each match with a blank. That removes every isolated letter and leaves only sequences of at least two consonants.
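
A rough sketch of this two-step approach, assuming Python's `re` module and assuming that "a blank" means a single space (the helper name `clusters_via_spaces` is hypothetical):

```python
import re

def clusters_via_spaces(word):
    """Blank out vowels, then blank out isolated letters, and split."""
    # Step 1 (from the question): replace every vowel with a space.
    spaced = re.sub(r"[aeiou]", " ", word, flags=re.IGNORECASE)
    # Step 2: a single word character with a word boundary on both sides
    # is an isolated letter, so replace it with a space as well.
    stripped = re.sub(r"\b\w\b", " ", spaced)
    # Whatever remains between the spaces is a run of two or more consonants.
    return stripped.split()

print(clusters_via_spaces("examples"))  # ['mpl']
```

Note that \w also matches digits and underscores, so this sketch is only meant for input made of plain letters.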

Like RegEx101.

Upvotes: 0

Avinash Raj

Reputation: 174776

Seems like you want something like this,

(?:(?![aeiou])[a-z]){2,}

(?![aeiou])[a-z] matches any lowercase letter except a, e, i, o, or u: the negative lookahead (?![aeiou]) rejects vowels before [a-z] consumes the character.

DEMO

  • (?![aeiou])[a-z] – matches a single lowercase consonant

  • (?:(?![aeiou])[a-z]){2,} – repeats that match two or more times in succession
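
As an illustration, the lookahead pattern can be tried like this. It is a sketch assuming Python's `re` module; `lookahead_clusters` is a hypothetical helper name, and the explicit character-class ranges are only there as a cross-check:

```python
import re

# Negative lookahead: any lowercase letter that is not a vowel, two or more times.
LOOKAHEAD = re.compile(r"(?:(?![aeiou])[a-z]){2,}")
# The same set of letters written out as explicit consonant ranges.
EXPLICIT = re.compile(r"[b-df-hj-np-tv-z]{2,}")

def lookahead_clusters(text):
    """Return runs of two or more lowercase consonants found via the lookahead pattern."""
    return LOOKAHEAD.findall(text)

print(lookahead_clusters("examples"))  # ['mpl']

# Both patterns should agree on lowercase input.
for word in ("examples", "strength", "rhythm"):
    assert LOOKAHEAD.findall(word) == EXPLICIT.findall(word)
```

As written, the pattern only matches lowercase letters; passing `re.IGNORECASE` when compiling would extend it to capitals.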

Upvotes: 1
