Reputation: 43
This is what I have so far:
match = re.sub(r'[0-9]',"","th1s n33ds to be r3m0v3d and this 2 doesnt")
This removes ALL the digits in the sentence, including the standalone 2 that should stay. I have tried everything I can think of. Does anyone have an idea how to get around this?
Much appreciated
Upvotes: 0
Views: 337
Reputation: 11524
You can use \B:
>>> re.sub(r'\B[0-9]+\B',"","th1s n33ds to be r3m0v3d and this 2 doesnt")
'ths nds to be rmvd and this 2 doesnt'
Translation from regex into English: remove every run of digits that sits inside a word, i.e. that has a word character on both sides.
\B - Matches the empty string, but only when it is not at the beginning or end of a word.
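To see why the standalone 2 survives, here is a small sketch that lists what each anchor pair matches in the question's sentence. The variable names are only for illustration.

```python
import re

text = "th1s n33ds to be r3m0v3d and this 2 doesnt"

# \B on both sides: the digit run needs a word character on each side,
# so it must sit strictly inside a word.
inside = re.findall(r'\B[0-9]+\B', text)

# \b on both sides: the digit run must be a whole word by itself.
standalone = re.findall(r'\b[0-9]+\b', text)

print(inside)      # ['1', '33', '3', '0', '3']
print(standalone)  # ['2']
```

The two patterns split the digits between them: \B picks up everything embedded in a word, and \b picks up only the lone 2.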
EDIT: if digits can also start or end a word, then this regex will do:
>>> re.sub(r'([0-9]+(?=[a-z])|(?<=[a-z])[0-9]+)',"","1th1s n33ds to be r3m0v3d and this 2 doesnt3")
'ths nds to be rmvd and this 2 doesnt'
Translation from regex into English: remove every run of digits that is followed or preceded by a lowercase letter. This second regex is pretty ugly and I'm sure there is a better way.
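One possible tidier variant, as a sketch rather than a definitive answer: swap [a-z] for [^\W\d_], which matches a letter of any case, so uppercase words are handled too. The helper name strip_digits_in_words is made up for this example.

```python
import re

# [^\W\d_] means "a word character that is neither a digit nor an underscore",
# i.e. a letter of any case (and, for str patterns, of any alphabet).
LETTER = r'[^\W\d_]'
DIGITS_TOUCHING_LETTER = re.compile(rf'\d+(?={LETTER})|(?<={LETTER})\d+')

def strip_digits_in_words(text):
    """Hypothetical helper: drop digit runs that are next to a letter."""
    return DIGITS_TOUCHING_LETTER.sub('', text)

print(strip_digits_in_words("1th1s n33ds to be r3m0v3d and this 2 doesnt3"))
# ths nds to be rmvd and this 2 doesnt
print(strip_digits_in_words("TH1S 2 stays"))
# THS 2 stays
```

Compiling the pattern once and naming the letter class keeps the two lookarounds readable, but the logic is the same as in the regex above.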
Upvotes: 2
Reputation: 5473
This works if the goal is to remove the whole word that contains digits, rather than just the digits:
re.sub(r'(?:[a-zA-Z]*[0-9]+[a-zA-Z]+)|(?:[a-zA-Z]+[0-9]+[a-zA-Z]*)',"","th1s n33ds to be r3m0v3d and this 2 doesnt this2")
# output
'  to be  and this 2 doesnt '
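If dropping whole words is what you want, a regex-free sketch may be easier to read. It assumes words are separated by whitespace, and drop_mixed_words is a made-up name.

```python
def drop_mixed_words(text):
    """Hypothetical helper: drop words that contain both letters and digits."""
    kept = []
    for word in text.split():
        has_digit = any(ch.isdigit() for ch in word)
        has_alpha = any(ch.isalpha() for ch in word)
        # Keep pure words ("to") and pure numbers ("2"); drop mixes ("th1s").
        if not (has_digit and has_alpha):
            kept.append(word)
    return " ".join(kept)

print(drop_mixed_words("th1s n33ds to be r3m0v3d and this 2 doesnt this2"))
# to be and this 2 doesnt
```

Unlike the re.sub version, this does not leave stray spaces behind where words were removed, but it also collapses any repeated whitespace in the input.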
Upvotes: 0