Shyam

Reputation: 17

Using regex to replace parts of a string in Python

I have strings like 1k, 300k, 500k_cleaned and replaced, etc.

I want to use the re module to replace k with 000 and delete the rest of the characters.

My code always throws errors:

renamed=re.sub(r"\b[k]\b",'000',df_VSS[i][0])

This is the line of code I have and I would be grateful for any help.

Upvotes: 1

Views: 53

Answers (2)

Wiktor Stribiżew

Reputation: 626851

The problem is that _ and digits are word chars, so there is no word boundary between k and _ and between 1 and k.
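A quick way to see this is to test the original pattern against a few inputs. This is only an illustrative sketch; the sample strings are taken from the question, and the last one (a standalone k) is an assumed input added for contrast.

```python
import re

samples = ["1k", "300k", "500k_cleaned"]

for s in samples:
    # \b needs a word/non-word transition. Digit -> k and k -> _ are both
    # word -> word, so the pattern finds nothing in these strings.
    print(s, re.search(r"\bk\b", s))

# A k surrounded by non-word characters does match (assumed example input).
print(re.search(r"\bk\b", "1 k ,"))
```

Each of the three sample strings prints None, which is why the substitution in the question leaves them unchanged.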

You can match k in between a digit and a character other than an alphanumeric char:

import re
text = '1k, 300k, 500k_cleaned and replaced'
print(re.sub(r'(?<=\d)k(?![^\W_])', '000', text))
# => 1000, 300000, 500000_cleaned and replaced

See the Python demo and the regex demo.

Details:

  • (?<=\d) - a positive lookbehind that requires a digit to appear immediately on the left
  • k - a k letter
  • (?![^\W_]) - a negative lookahead that fails the match if an alphanumeric char (that is, any char other than a non-word char or an underscore) appears immediately on the right; it works like a trailing \b with _ subtracted from the word chars.
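The question also asks to delete the rest of the characters. If each string holds a single value, one possible sketch is to capture the leading digits and consume everything after the k. The helper name to_thousands and the anchored pattern are assumptions for illustration, not part of the answer above.

```python
import re

def to_thousands(label):
    """Hypothetical helper: '500k_cleaned and replaced' -> '500000'."""
    # ^(\d+)     capture the leading digits
    # k(?!...)   a k that is not followed by a letter or digit
    # .*$        consume (and so delete) the rest of the string
    # \g<1> is used because \1000 would not be read as group 1 plus "000".
    return re.sub(r"^(\d+)k(?![^\W_]).*$", r"\g<1>000", label)

for label in ["1k", "300k", "500k_cleaned and replaced", "5kg"]:
    print(label, "->", to_thousands(label))
```

Strings that do not fit the pattern, such as 5kg, are returned unchanged.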

Upvotes: 1

Fonzie

Reputation: 186

If you only have to deal with the problem you described, a possibly simpler solution is to use:

my_str = "500k_cleaned"        # example input
s = my_str.split("k")[0]       # everything before the first k
s += "000"                     # s is now "500000"
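One caveat with the split approach is that a string with no k, or with a k that does not follow a number, still gets 000 appended. A slightly more defensive sketch, assuming such strings should be left untouched (the helper name k_to_thousands is hypothetical), could look like this:

```python
def k_to_thousands(my_str):
    """Hypothetical helper: only rewrite strings shaped like '<digits>k...'."""
    # partition returns (head, separator, tail); the separator is "" when
    # the string contains no k at all.
    head, sep, _tail = my_str.partition("k")
    if sep and head.isdigit():
        return head + "000"
    return my_str  # leave anything else unchanged

for value in ["1k", "500k_cleaned", "cleaned", "backup_5k"]:
    print(value, "->", k_to_thousands(value))
```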

Regexes can be tricky, so I would advise using them only when no easier solution is available. If you do use them, the website https://regex101.com/ can come in handy.

Upvotes: 1
