Reputation: 495
I am trying to write a regular expression that finds pairs of adjacent words that both start with a consonant.
Here is what I came up with:
\b[^aeiou0-9\W][a-z]+\s[^aeiou0-9\W][a-z]+\b
The issue is that each word needs to be checked against both the word before it and the word after it, so a word can appear in two pairs.
For example:
Vowels and consonants are sounds, not letters. Depending on your accent and how thinly you slice them, there are about 20 vowels and 24 consonants.
Result:
how thinly
you slice
Expected result:
how thinly
thinly you
you slice
Upvotes: 0
Views: 105
Reputation:
\b([^aeiou0-9\W][a-z]+)\s(?=([^aeiou0-9\W][a-z]+))
Example: https://regex101.com/r/tSblEq/1
This is based on the rules in your pattern. Instead of matching <word><space><word> directly, it matches <word><space><lookahead word>. It then captures the <word> and the <lookahead word> in groups 1 and 2 respectively.
Because the lookahead doesn't consume any characters, the second word of each pair is still available to be matched as the first word of the next pair.
Result:
Match 1 Group 1: how
Match 1 Group 2: thinly
Match 2 Group 1: thinly
Match 2 Group 2: you
Match 3 Group 1: you
Match 3 Group 2: slice
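For anyone who wants to try this outside regex101, here is a minimal Python sketch of the same pattern. The function name `consonant_pairs` and the short test phrase are my own choices for illustration, and I'm assuming Python's `re` module here; other flavors with lookahead support should behave similarly.

```python
import re

# Pattern from the answer: consume the first word and the space,
# but only look ahead at the second word so it is not consumed.
PAIR_PATTERN = re.compile(r"\b([^aeiou0-9\W][a-z]+)\s(?=([^aeiou0-9\W][a-z]+))")


def consonant_pairs(text):
    """Return (word, following_word) tuples taken from groups 1 and 2."""
    return [(m.group(1), m.group(2)) for m in PAIR_PATTERN.finditer(text)]


# Hypothetical usage on the phrase from the question.
for first, second in consonant_pairs("how thinly you slice"):
    print(first, second)
```

Note that the matched text itself (group 0) is only the first word plus the space, so the pair has to be read from the two capture groups.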
Edit to better demonstrate matching directly vs lookahead:
Direct:
how thinly you slice
          ^
The engine is left at the end of the match, just after "thinly". Because "thinly" has already been consumed, it cannot start a new match.
Using lookahead:
how thinly you slice
    ^
Because the match only consumed "how" and the space after it, the engine is now left at the beginning of the word "thinly".
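The difference in where the engine ends up can be checked with a rough Python sketch. This assumes Python's `re` module, and the variable names are mine; it compares the end offset of the first match for each pattern.

```python
import re

text = "how thinly you slice"

# Direct version from the question: both words are consumed.
direct = re.compile(r"\b[^aeiou0-9\W][a-z]+\s[^aeiou0-9\W][a-z]+\b")
# Lookahead version from the answer: only the first word and the space are consumed.
lookahead = re.compile(r"\b([^aeiou0-9\W][a-z]+)\s(?=([^aeiou0-9\W][a-z]+))")

# end() is the offset where the engine resumes searching.
direct_end = direct.search(text).end()
lookahead_end = lookahead.search(text).end()

print(repr(text[direct_end:]))     # what is left after the direct match
print(repr(text[lookahead_end:]))  # what is left after the lookahead match

# The direct pattern therefore skips the overlapping pair.
direct_matches = direct.findall(text)
print(direct_matches)
```

With the direct pattern the remaining text starts after "thinly", while with the lookahead it starts at "thinly", which is why the overlapping pair is found.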
Upvotes: 1