Joff

Reputation: 12187

python re.sub single or multiple characters

I have a lot of strings of the form

100XX
123XX
1XX
234XXXXX

and I would like to replace every X with a 0. The strings also contain other text, in the form of an address:

234XX N. Somestreet Anytown, USA

I can't be sure that X doesn't appear elsewhere in the string, so I cannot just replace every X.

I have this code so far, but it only inserts a single 0, and I need it to insert one 0 per X:

re.sub(r"([0-9]+)([X]+)", r"\g<1>0", "234XX")

This gives me 2340, but I need it to return 23400. Likewise, given 123XXX, I need it to return 123000.

Upvotes: 0

Views: 2345

Answers (3)

Joff

Reputation: 12187

What I ended up doing was writing a callable and passing it to re.sub:

import re

def sub_0_for_x(match):
    # group(1) is the leading digits, group(2) is the run of X's
    return match.group(1) + "0" * len(match.group(2))

re.sub(r"([0-9]+)(X+)", sub_0_for_x, "123XX Anyplace, USA")  # '12300 Anyplace, USA'
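A minimal sketch of the same callable idea, run against the sample strings from the question. The helper name zero_fill_x and the precompiled PATTERN are illustrative choices, not part of the original answer; the point is only that the replacement function sees each match and can emit one 0 per X.

```python
import re

# Digits followed by one or more X's; group 2 is the run of X's.
PATTERN = re.compile(r"([0-9]+)(X+)")

def zero_fill_x(text):
    """Replace each X that directly trails a number with a 0 (hypothetical helper)."""
    return PATTERN.sub(lambda m: m.group(1) + "0" * len(m.group(2)), text)

samples = ["100XX", "123XX", "1XX", "234XXXXX", "234XX N. Somestreet Anytown, USA"]
for sample in samples:
    print(zero_fill_x(sample))
```

An X that is not preceded by a digit is left alone, because the pattern requires at least one digit before the run.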

Upvotes: 0

Darkstarone

Reputation: 4730

What I'd do is use finditer to get match objects for your regex; you can then use methods like start() and end() to rebuild your string. Since each X is replaced by exactly one 0, the string length never changes, so you can do this in place without worrying about index issues.

import re

res = '234XX N. Somestreet Anytown, USA\n234XXXXXX N. Somestreet Anytown, USA\nXXXXXXXXXX'

for match in re.finditer(r"([0-9]+)([X]+)", res):
    print(match.group(1))
    print(len(match.group(2)))
    # res = res[:match.end(1)] + ('0' * len(match.group(2))) + res[match.end():]
    res = res[:match.end(1)] + match.group(2).replace('X','0') + res[match.end():]

print(res)
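The in-place slicing above depends on the replacement having the same length as the text it replaces. As a hedged variant, here is a sketch that collects the pieces into a list and joins them at the end, so the match offsets always refer to the untouched input. The function name rebuild_with_zeros is made up for this example.

```python
import re

def rebuild_with_zeros(text):
    """Rebuild text, swapping each X that trails a number for a 0 (hypothetical helper)."""
    pieces = []
    last = 0  # end of the previous match in the original text
    for m in re.finditer(r"([0-9]+)(X+)", text):
        # Copy everything up to and including the digits unchanged.
        pieces.append(text[last:m.end(1)])
        # Emit one 0 for every X in the run.
        pieces.append("0" * len(m.group(2)))
        last = m.end()
    # Copy whatever follows the final match.
    pieces.append(text[last:])
    return "".join(pieces)

print(rebuild_with_zeros("234XX N. Somestreet Anytown, USA\nXXXXXXXXXX"))
```

Because the input string is never modified, this version keeps working if the replacement is later changed to something of a different length.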

Upvotes: 1

Sebastian Proske

Reputation: 8413

You can use a callback function to get your desired result, see http://ideone.com/ccB37k

import re

def repl(m):
    return (m.group(1) + m.group(2).replace('X','0'))

text = '234XX N. Somestreet Anytown, USA'  # avoid shadowing the built-in str
pattern = r'\b(\d+)(X+)\b'
print(re.sub(pattern, repl, text))
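The \b anchors in this pattern restrict matches to tokens that consist only of digits followed by X's. A small sketch comparing it with the unanchored pattern from the question; the sample strings "123XXabc" and "AX12XX" are invented for illustration.

```python
import re

def repl(m):
    # Keep the digits, then emit one 0 per X.
    return m.group(1) + "0" * len(m.group(2))

bounded = re.compile(r"\b(\d+)(X+)\b")   # whole token must be digits + X's
unbounded = re.compile(r"(\d+)(X+)")     # digits + X's anywhere

for s in ["234XX N. Somestreet", "123XXabc", "AX12XX"]:
    print(s, "->", bounded.sub(repl, s), "|", unbounded.sub(repl, s))
```

With the anchors, "123XXabc" and "AX12XX" are left unchanged, while the unanchored pattern rewrites them to "12300abc" and "AX1200". Which behaviour is wanted depends on how the address data actually looks.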

Upvotes: 2
