Reputation: 23
I need to remove any 'h' in a string if it comes after a vowel.
E.g.
John -> Jon
Baht -> Bat
Hot -> Hot (no change)
Rhythm -> Rhythm (no change)
Finding the words isn't a problem, but removing the 'h' is, as I still need to keep the original vowel. Can this be done in one regex?
Upvotes: 0
Views: 108
Reputation: 962
The regex for matching an h after a vowel would use a positive lookbehind:
(?<=[aeiou])h
And you can do
re.sub(r"([a-zA-Z]*?)(?<=[aeiou])h([a-zA-Z]*)",r"\1\2",s)
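As a rough check, here is that substitution applied to the words from the question. The helper name drop_h_once and the [aeiou] vowel class are assumptions made for this sketch, and the pattern is case-sensitive as written.

```python
import re

# Assumption: vowels are a, e, i, o, u (lowercase only, no re.I flag here).
PATTERN = re.compile(r"([a-zA-Z]*?)(?<=[aeiou])h([a-zA-Z]*)")

def drop_h_once(word):
    # Keep the text before and after the matched h, dropping the h itself.
    return PATTERN.sub(r"\1\2", word)

for word in ["John", "Baht", "Hot", "Rhythm"]:
    print(word, "->", drop_h_once(word))
```

Note that the trailing ([a-zA-Z]*) group is greedy, so a single call removes at most one h per run of letters.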
However, if the string can contain more than one h after a vowel, this pattern needs several passes: a regex has a fixed number of capture groups, and the trailing ([a-zA-Z]*) group swallows the rest of the word, so each pass removes only the first such h.
import re
s = "bahtbaht"
s1 = s
while True:
    # Each pass removes at most one h per run of letters.
    s1 = re.sub(r"([a-zA-Z]*?)(?<=[aeiou])h([a-zA-Z]*)",r"\1\2",s)
    # Stop once a pass leaves the string unchanged.
    if s1 == s:
        break
    s = s1
print(s1)
A cleaner form passes a function as the repl argument, so a single call handles every occurrence:
import re
def subit(m):
    # The pattern has exactly one capture group: the text before the h (or before the end).
    match, = m.groups()
    return match
s = "bahtbaht"
print(re.sub(r"([a-zA-Z]*?)(?:(?<=[aeiou])h|$)",subit,s))
A much simpler answer, thanks to @tobias_k:
re.sub(r"([aeiou])h", r"\1", s, flags = re.I)
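To sanity-check this one-liner against the examples in the question, something like the following should do. The wrapper name strip_h_after_vowel is only for illustration, and "bahtbaht" is added to cover the case of more than one h.

```python
import re

def strip_h_after_vowel(text):
    # Capture the vowel and put it back, which drops the h that follows it.
    # re.I makes both the vowel and the h match case-insensitively.
    return re.sub(r"([aeiou])h", r"\1", text, flags=re.I)

examples = {
    "John": "Jon",
    "Baht": "Bat",
    "Hot": "Hot",
    "Rhythm": "Rhythm",
    "bahtbaht": "batbat",
}

for word, expected in examples.items():
    result = strip_h_after_vowel(word)
    print(word, "->", result, "(ok)" if result == expected else "(unexpected)")
```

Because re.sub replaces every non-overlapping match, no loop is needed here.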
Upvotes: 2