Reputation: 14836
I need to replace all occurrences of "W32 L30" with "W32in L30in" in a large corpus of text. The numbers after W, L also vary.
I thought of using these regular expressions:
[W]([-+]?\d*\.\d+|\d+)
[L]([-+]?\d*\.\d+|\d+)
But these only find the number after each W and L, so replacing every occurrence by hand would still be laborious and very time-consuming. Is there a way to do the replacement directly with a regex?
Upvotes: 2
Views: 65
Reputation: 476709
You can combine both patterns into a single regex with a capture group, and then use a backreference in the replacement string. For example:
import re
RGX = re.compile(r'([WL]([-+]?\d*\.\d+|\d+))(in)?')
result = RGX.sub(r'\1in', some_string)
The \1 is used to reference the first capture group: the part of the string we capture with [WL]([-+]?\d*\.\d+|\d+). The last part, (in)?, optionally also matches the word in, so that if there is already an in, we simply replace it with the same value.
So if some_string is for instance:
>>> some_string
'A W2 in C3.15 where L2.4in and a bit A4'
>>> RGX.sub(r'\1in', some_string)
'A W2in in C3.15 where L2.4in and a bit A4'
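As a self-contained sketch, the same pattern can be applied to the input from the question. The helper name add_inches is hypothetical and only introduced here for illustration. Running it twice shows that the optional (in)? group keeps the substitution from appending a second in:

```python
import re

# Same pattern as above: W or L followed by a number, then an optional "in".
RGX = re.compile(r'([WL]([-+]?\d*\.\d+|\d+))(in)?')

def add_inches(text):
    """Append 'in' to every W<number> / L<number> token (hypothetical helper)."""
    # \1 is the letter plus the number; an existing "in" is consumed by (in)?
    # and written back once, so it is never doubled.
    return RGX.sub(r'\1in', text)

once = add_inches('W32 L30')
twice = add_inches(once)  # applying it again leaves the text unchanged
print(once)
print(twice)
```

Both print calls should show W32in L30in.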
Upvotes: 2