Reputation: 8274
import re
s = 'Sarah Ruthers#6'
output = re.sub("[^\\w]", "", s)
print(output)
The above removes ALL non-word characters (anything that isn't a letter, digit, or underscore), wherever they appear; I simply want to remove any characters after the last alpha (letter) character, i.e. whatever trails the last letter.
i.e. Sarah Ruthers#6
to output simply:
Sarah Ruthers
My regex above outputs SarahRuthers6
(removing the space and the #, but keeping the digit)
Upvotes: 0
Views: 883
Reputation: 155418
Anchor your pattern at the end, and use a correct character class:
output = re.sub(r"[\W\d_]+$", "", s)
That'll remove a single run of all non-letter characters at the end of the string; the $ anchor limits the range, and [\W\d_] properly matches non-letters, not just non-word characters (word characters include digits and the underscore character).
I also made the regex a raw string (which you should always do anyway for regex patterns), removing the need to double the backslashes.
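As a quick check of the anchored pattern, here is a minimal sketch assuming Python 3; the helper name strip_trailing_non_letters and the extra sample strings are made up for illustration and are not part of the question:

```python
import re

# Precompiled version of the pattern from the answer: one run of
# non-letters ([\W\d_]+) that must reach the end of the string ($).
TRAILING_NON_LETTERS = re.compile(r"[\W\d_]+$")


def strip_trailing_non_letters(text):
    """Hypothetical helper: drop whatever follows the last letter."""
    return TRAILING_NON_LETTERS.sub("", text)


print(strip_trailing_non_letters("Sarah Ruthers#6"))   # Sarah Ruthers
print(strip_trailing_non_letters("Sarah_99 Ruthers"))  # unchanged, the string already ends in a letter
print(strip_trailing_non_letters("name_42!! "))        # name

# The raw string and the doubled-backslash spelling are the same pattern.
print(r"[\W\d_]+$" == "[\\W\\d_]+$")                   # True
```

Interior spaces, digits, and punctuation are left alone because the match has to end at $.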
Note that while [^a-zA-Z] could replace [\W\d_] for your specific case, I strongly recommend [\W\d_] over [^a-zA-Z] because the former is Unicode friendly, while the latter is not; for example, if your text is 'résumé', using [^a-zA-Z] will strip the trailing é, while [\W\d_] won't.
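The difference can be seen side by side in a small sketch. It assumes Python 3, where str patterns are Unicode-aware by default, and the '#1' suffix is added here purely for illustration:

```python
import re

# 'résumé#1', with é written as the precomposed code point U+00E9 so the
# example does not depend on how the source file is normalized.
text = "r\u00e9sum\u00e9#1"

# ASCII-only class: é is not in a-zA-Z, so it is stripped with '#1'.
ascii_only = re.sub(r"[^a-zA-Z]+$", "", text)

# Unicode-aware class: é is a word character and not a digit or
# underscore, so only '#1' is stripped.
unicode_aware = re.sub(r"[\W\d_]+$", "", text)

print(ascii_only)     # résum
print(unicode_aware)  # résumé
```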
Upvotes: 2
Reputation: 26039
\w is the "word character" class, which includes alphanumerics (letters, numbers) plus the underscore (_).
Say that you only need to keep uppercase and lowercase ASCII letters (and spaces) at the end of the string:
output = re.sub("[^A-Za-z ]+$", "", s)
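One thing to be aware of with this pattern: because the space is inside the negated class, trailing spaces are kept. A minimal sketch, assuming Python 3 and a made-up input that has a space before the '#':

```python
import re

s = "Sarah Ruthers #6"  # illustrative input, note the space before '#'

# The space is excluded from the characters to strip, so it survives.
kept_space = re.sub(r"[^A-Za-z ]+$", "", s)
print(repr(kept_space))   # 'Sarah Ruthers '

# One possible follow-up if the trailing space is unwanted.
print(repr(kept_space.rstrip()))  # 'Sarah Ruthers'
```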
Upvotes: 0