Reputation: 17
I have a situation where I have strings like 1k, 300k, 500k_cleaned and replaced, etc.
I wish to use the regex package to replace k with 000 and delete the rest of the characters.
My code always throws errors:
renamed=re.sub(r"\b[k]\b",'000',df_VSS[i][0])
This is the line of code I have and I would be grateful for any help.
Upvotes: 1
Views: 53
Reputation: 626851
The problem is that _ and digits are word chars, so there is no word boundary between k and _ and between 1 and k.
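A minimal sketch of that behavior, using the strings from the question plus a bare k for contrast (the names samples and results are only illustrative):

```python
import re

samples = ['1k', '300k', '500k_cleaned and replaced', 'k']

# \b only matches between a word char and a non-word char (or a string edge).
# Digits, letters and _ are all word chars, so there is no boundary
# between 1 and k, or between k and _.
results = {s: bool(re.search(r'\bk\b', s)) for s in samples}

for s, found in results.items():
    print(repr(s), found)
# '1k' False
# '300k' False
# '500k_cleaned and replaced' False
# 'k' True
```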
You can match k in between a digit and a character other than an alphanumeric char:
import re
text = '1k, 300k, 500k_cleaned and replaced'
print( re.sub(r'(?<=\d)k(?![^\W_])', '000', text) )
# => 1000, 300000, 500000_cleaned and replaced
See the Python demo and the regex demo.
Details:
(?<=\d) - a positive lookbehind that requires a digit to appear immediately on the left
k - a k letter
(?![^\W_]) - a negative lookahead that fails the match if there is a char other than a non-word or underscore char, i.e. a letter or digit, immediately on the right (it is a \b with _ subtracted from it).
Upvotes: 1
Reputation: 186
If you only have to deal with the problem you described, a (maybe) simpler solution could be to use:
s = my_str.split("k")[0] # get everything before k
s += "000"
Regexes can be tricky, so I would advise using them only if no easier solution has been found. Also, if you do use regexes, the website https://regex101.com/ can come in handy.
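As a rough sketch, the split idea could be wrapped in a small helper. The name expand_k is hypothetical, and str.partition is swapped in for split so that strings without a k come back unchanged. It assumes the first k in the string is the one right after the number:

```python
def expand_k(value: str) -> str:
    """Keep the text before the first 'k' and append '000'."""
    head, sep, _rest = value.partition("k")
    # sep is an empty string when there is no "k" in value
    return head + "000" if sep else value

print([expand_k(s) for s in ["1k", "300k", "500k_cleaned and replaced", "42"]])
# ['1000', '300000', '500000', '42']
```

Unlike the regex substitution, this also drops everything after the k, which is what the question asks for.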
Upvotes: 1