Reputation: 650
The regex below, in Python, gives an output of 2, but the output should be 4. I want to count the occurrences of a vowel that has a consonant immediately before and after it. However, the regex skips the next consonant when a vowel follows it.
Example: in 'lolololol', my condition is satisfied for indices 0 to 2. The regex then moves on to index 3, but I want it to check again starting from the last matched index, that is, from 2. How can this be done with Python regex? Below is my code:
import re
p = re.findall(r'[b-df-hj-np-tv-z][aeiou][b-df-hj-np-tv-z]', 'lolololol', re.IGNORECASE)
print(len(p))  # prints 2, but 4 is expected
Upvotes: 2
Views: 115
Reputation: 67978
import re
p = re.findall(r'(?<=[b-df-hj-np-tv-z])[aeiou](?=[b-df-hj-np-tv-z])', 'lolololol', re.IGNORECASE)
print(len(p))  # prints 4
Use lookarounds (here a lookbehind and a lookahead) when matches can overlap; otherwise the characters consumed by one match are not available for the following match attempt. See demo.
https://regex101.com/r/lR1eC9/14
It has 4 matches.
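To see where those 4 matches fall, here is a minimal sketch that reports the index of each matched vowel. The names CONSONANT, VOWEL and count_enclosed_vowels are made up for this illustration and are not part of the original code.

```python
import re

# Hypothetical helper names, used only for this illustration.
CONSONANT = "[b-df-hj-np-tv-z]"
VOWEL = "[aeiou]"

# The lookbehind and lookahead assert the neighbouring consonants
# without consuming them, so only the vowel itself is matched.
pattern = re.compile(f"(?<={CONSONANT}){VOWEL}(?={CONSONANT})", re.IGNORECASE)


def count_enclosed_vowels(text):
    """Count vowels that have a consonant directly before and after."""
    return len(pattern.findall(text))


positions = [m.start() for m in pattern.finditer("lolololol")]
print(positions)                           # [1, 3, 5, 7]
print(count_enclosed_vowels("lolololol"))  # 4
```

Each match is a single vowel, so the consonant at index 2 can serve as the right neighbour of one match and the left neighbour of the next.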
Upvotes: 3
Reputation: 627082
You should first understand what your regex is doing.
It matches the first l with [b-df-hj-np-tv-z], then the vowel o with [aeiou], and then the following l with [b-df-hj-np-tv-z]. The match is found and returned, and the regex index is now at the second o. This o cannot be matched with [b-df-hj-np-tv-z], so the match attempt fails and the index moves on to the next l. A match is found there: lol. Then the following o again cannot be matched, and the final l is not matched either, since no vowel and consonant follow it.
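The walk-through above can be checked with re.finditer, which exposes the span of each match. This is only a sketch for inspection; the variable names consuming and spans are chosen here for illustration.

```python
import re

# The original, fully consuming pattern from the question.
consuming = re.compile(r"[b-df-hj-np-tv-z][aeiou][b-df-hj-np-tv-z]", re.IGNORECASE)

# Record (start, end, text) for every match; end is exclusive.
spans = [(m.start(), m.end(), m.group()) for m in consuming.finditer("lolololol")]
print(spans)  # [(0, 3, 'lol'), (4, 7, 'lol')]
```

The first match ends at index 3, so the l at index 2 is already consumed and cannot start another match; the same happens to the l at index 6.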
You only need to use a look-ahead (?=[b-df-hj-np-tv-z]) instead of the final [b-df-hj-np-tv-z] so that the character is only checked and not consumed:
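import re
# The trailing consonant is wrapped in a lookahead (?=...): it is checked but not consumed.
p = re.compile(r'[b-df-hj-np-tv-z][aeiou](?=[b-df-hj-np-tv-z])')
test_str = "lolololol"
print(p.findall(test_str))       # ['lo', 'lo', 'lo', 'lo']
print(len(p.findall(test_str)))  # 4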
See IDEONE demo
That way, the trailing "syllable" boundary is checked but not consumed, so it remains available to be tested during the next match attempt.
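If the full overlapping consonant-vowel-consonant triples are wanted rather than just a count, one possible variation (not from the answer above) is to put the whole pattern in a capturing group inside a lookahead. The names overlapping and triples are illustrative only.

```python
import re

# The entire pattern sits inside a lookahead, so each match is zero-width
# and nothing is consumed; the capturing group still records the triple.
overlapping = re.compile(r"(?=([b-df-hj-np-tv-z][aeiou][b-df-hj-np-tv-z]))", re.IGNORECASE)

# With a single group, findall returns the captured text of each match.
triples = overlapping.findall("lolololol")
print(triples)       # ['lol', 'lol', 'lol', 'lol']
print(len(triples))  # 4
```

Because every match is empty, the engine advances one character at a time and tries the lookahead at each position, which is what lets the triples overlap.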
A must-read article about how Lookarounds Stand their Ground at rexegg.com.
Upvotes: 3