Brian
Brian

Reputation: 14836

Replace particular strings in python

I need to replace all occurrences of "W32 L30" with "W32in L30in" in a large corpus of text. The numbers after W, L also vary.

I thought of using this regex expressions

[W]([-+]?\d*\.\d+|\d+)
[L]([-+]?\d*\.\d+|\d+)

But these would only find the number after each W and L, so it's still laborious and very time consuming to replace every occurrence so I was wondering if there's a way to do this directly in regex.

Upvotes: 2

Views: 65

Answers (1)

willeM_ Van Onsem
willeM_ Van Onsem

Reputation: 476709

You can use a capture group and simplify the regex. Next we can then use a backref to do the replacement. Like:

import re

RGX = re.compile(r'([WL]([-+]?\d*\.\d+|\d+))(in)?')
result = RGX.sub(r'\1in', some_string)

The \1 is used to reference the first capture group: the result of the string we capture with [WL]([-+]?\d*\.\d+|\d+). The last part (in)? optionally also matches the word in, such that in case there is already an in, we simply replace it with the same value.

So if some_string is for instance:

>>> some_string
'A W2 in C3.15 where L2.4in and a bit A4'
>>> RGX.sub(r'\1in', some_string)
'A W2in in C3.15 where L2.4in and a bit A4'

Upvotes: 2

Related Questions