Reputation: 13
I am a Java developer, and new to Python. I would like to define a regex accepting all the alphabetic characters except for some of them. I want to exclude just the vowels and the character 'y', be it in upper- or lowercase.
The regex in Java for it would be as follows:
"[a-zA-Z&&[^aeiouyAEIOUY]]"
How can I (re)define it in Python? The above doesn't work in Python, obviously. I would also NOT like the following pattern to be suggested:
"[bcdfghjklmnpqrstvwxzBCDFGHJKLMNPQRSTVWXZ]"
Upvotes: 0
Views: 544
Reputation: 40753
I don't think Python's built-in regular expression module (re) has exactly what you're looking for. The third-party regex module, intended as its eventual replacement, does have what you need, and you can install it should you wish.
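As a rough sketch of what that could look like: the regex package documents set operations such as intersection (&&) under its "version 1" behaviour, switched on here with the inline (?V1) flag. The sample string and the name consonants are only illustrative, and the import is guarded because the package is not part of the standard library.

```python
try:
    import regex  # third-party package: pip install regex
except ImportError:
    regex = None  # nothing below runs if the package is not installed

if regex is not None:
    # (?V1) turns on version 1 behaviour, which allows nested sets and
    # set operations; && intersects a-zA-Z with "anything but vowels and y".
    consonants = regex.compile(r"(?V1)[a-zA-Z&&[^aeiouyAEIOUY]]+")
    print(consonants.findall("xyz_ d10 word"))
```

This keeps the pattern very close to the Java original, at the cost of an extra dependency.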
Other than that, a negation might be the way to go. Basically, define all the characters you don't want and then invert that. Sounds laborious, but the "not-word" shorthand (\W) can help us out. \w means a-zA-Z0-9_ (for ASCII matches), and \W means the opposite ([^\w]). Thus, [aeiouyAEIOUY\W\d_] means every character which you are not looking for, and so [^aeiouyAEIOUY\W\d_] means every character you are looking for. e.g.
>>> import re
>>> s = "xyz_ d10 word"
>>> pattern = r"[^aeiouyAEIOUY\W\d_]+"
>>> re.findall(pattern, s)
['x', 'z', 'd', 'w', 'rd']
If you are strictly after only ASCII characters then you can use the re.ASCII flag. e.g.
>>> s = "Español"
>>> re.findall(pattern, s)
['sp', 'ñ', 'l']
>>> re.findall(pattern, s, re.ASCII)
['sp', 'l']
Upvotes: 2
Reputation: 17332
(?=...) Positive lookahead assertion. This succeeds if the contained regular expression, represented here by ..., successfully matches at the current location, and fails otherwise. But, once the contained expression has been tried, the matching engine doesn’t advance at all; the rest of the pattern is tried right where the assertion started.
(?!...) Negative lookahead assertion. This is the opposite of the positive assertion; it succeeds if the contained expression doesn’t match at the current position in the string.
r"(?![aeiouyAEIOUY])[a-zA-Z]"
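A minimal sketch of how the negative lookahead could be used with the standard re module. The names CONSONANT and CONSONANT_RUN and the sample strings are only illustrative. The lookahead rejects a position if the next character is a vowel or y, and [a-zA-Z] then consumes one ASCII letter.

```python
import re

# One consonant: the next character must not be a vowel or y,
# and it must be an ASCII letter.
CONSONANT = re.compile(r"(?![aeiouyAEIOUY])[a-zA-Z]")

# Runs of consonants: repeat the lookahead + letter pair as a group.
CONSONANT_RUN = re.compile(r"(?:(?![aeiouyAEIOUY])[a-zA-Z])+")

print(CONSONANT.findall("xyz_ d10 word"))      # ['x', 'z', 'd', 'w', 'r', 'd']
print(CONSONANT_RUN.findall("xyz_ d10 word"))  # ['x', 'z', 'd', 'w', 'rd']
print(CONSONANT_RUN.findall("Español"))        # ['sp', 'l']
```

Because the character class is [a-zA-Z], non-ASCII letters such as ñ are never matched, so no re.ASCII flag is needed with this variant.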
Upvotes: 0