Reputation: 1049
Problem: Find all runs of vowels (two or more) that are sandwiched between two consonants. These runs can also come at the beginning or end of the line. Example:
Input:
abaabaabaabaae
Expected output:
['aa','aa','aa','aae']
Solution tried:
import re
pattern=re.compile(r'(?:[^aeiouAEIOU])([AEIOUaeiou]{2,})(?=[^AEIOUaeiou])')
pattern.findall("abaabaabaabaae")
This gives ['aa','aa','aa']. It ignores 'aae' because the end of the line is not part of the search criteria. How can I include the end-of-line anchor ($) so that it is an alternative (an OR condition) to the trailing consonant rather than a mandatory end of line?
Upvotes: 1
Views: 442
Reputation: 110745
You can extract matches of the regular expression
r'(?<=[b-df-hj-np-tv-z])[aeiou]{2,}(?=[b-df-hj-np-tv-z]|$)'
For the following string the matches are indicated.
_abaab_aabaabaaeraaa_babaa%abaa
   ^^     ^^ ^^^             ^^
I found it easiest to explicitly match consonants with the character class
[b-df-hj-np-tv-z]
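A minimal sketch of how this pattern could be run in Python, assuming lowercase input as in the question. The names CONSONANT and pattern are illustrative only, and re.IGNORECASE would be needed for mixed-case text:

```python
import re

# Consonants are listed explicitly, so digits, '_' and punctuation do not count.
CONSONANT = r'[b-df-hj-np-tv-z]'

# Two or more vowels, preceded by a consonant and followed by a consonant OR the end of the string.
pattern = re.compile(rf'(?<={CONSONANT})[aeiou]{{2,}}(?={CONSONANT}|$)')

print(pattern.findall("abaabaabaabaae"))                   # ['aa', 'aa', 'aa', 'aae']
print(pattern.findall("_abaab_aabaabaaeraaa_babaa%abaa"))  # ['aa', 'aa', 'aae', 'aa']
```

Putting $ inside the lookahead as an alternative is what makes the end of the string optional rather than mandatory.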
Upvotes: 1
Reputation: 522646
I would use re.findall with the pattern (?<=[^\Waeiou])[aeiou]+(?![aeiou]):
import re

inp = "abaabaabaabaae"
matches = re.findall(r'(?<=[^\Waeiou])[aeiou]+(?![aeiou])', inp, flags=re.IGNORECASE)
print(matches)
This prints:
['aa', 'aa', 'aa', 'aae']
Here is an explanation of the regex pattern:
(?<=[^\Waeiou]) assert that what precedes is any word character, excluding a vowel
this also excludes the start of the input
[aeiou]+ match one or more vowel characters
(?![aeiou]) assert that what follows is not a vowel (includes end of string)
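For comparison, here is a small sketch of how this pattern behaves next to a variant that requires two or more vowels, as the question asks. The sample string and the {2,} variant are assumptions added for illustration and are not part of the answer above:

```python
import re

# Pattern as given: one or more vowels after a non-vowel word character.
one_or_more = re.compile(r'(?<=[^\Waeiou])[aeiou]+(?![aeiou])', re.IGNORECASE)
# Hypothetical variant: same lookarounds, but at least two vowels.
two_or_more = re.compile(r'(?<=[^\Waeiou])[aeiou]{2,}(?![aeiou])', re.IGNORECASE)

sample = "babaab_aa"
print(one_or_more.findall(sample))  # ['a', 'aa', 'aa']
print(two_or_more.findall(sample))  # ['aa', 'aa']
```

Note that [^\Waeiou] also accepts digits and the underscore, which is why the final 'aa' after '_' is matched by both patterns.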
Upvotes: 0