Reputation: 125
In a random string I need to find a substring matching a given pattern and put ; after it. I think I should use re to do it, but I am not that familiar with it.
Example input:
this is the first part of string 1/32 part this is the second part of string
As a result, I need to put ; after the 1/32 part, e.g.:
this is the first part of string 1/32 part; this is the second part of string
I know I should use re, and I know I should probably use re.match with a pattern looking like [1-1000]/[1-1000]\spart, but I'm not sure where to go from here.
Edit: 1/32 is just an example; it could also be 65/123, 1/3, 6/7.
Upvotes: 1
Views: 105
Reputation: 10238
Your use case is called substitution. This is exactly what the re.sub function is for.
import re
s = "bla 1/6 part bla bla 76/88 part 12345/12345 part bla"
print(s)
s = re.sub(r'(\b\d{1,4}/\d{1,4} part)', r'\1;', s)
print(s)
The second print outputs:
bla 1/6 part; bla bla 76/88 part; 12345/12345 part bla
Note the missing ; after the last occurrence of part: 12345 has five digits, so \d{1,4} together with \b does not match it.
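To see what the \b word boundary contributes, here is a minimal sketch comparing the pattern with and without it. The helper name add_semicolon and the sample string are made up for illustration and are not part of the answer above.

```python
import re

def add_semicolon(text, pattern):
    """Append ';' after every match of pattern (hypothetical helper)."""
    # \g<0> in the replacement refers to the whole match.
    return re.sub(pattern, r'\g<0>;', text)

# Made-up input with a five-digit numerator.
sample = "x 12345/12 part"

# With \b the match cannot start in the middle of a run of digits.
with_boundary = add_semicolon(sample, r'\b\d{1,4}/\d{1,4} part')
# Without \b the regex is free to match the tail "2345/12 part".
without_boundary = add_semicolon(sample, r'\d{1,4}/\d{1,4} part')

print(with_boundary)     # x 12345/12 part
print(without_boundary)  # x 12345/12 part;
```

Without the boundary, a number that is too long still gets a ; because its last four digits satisfy the quantifier.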
I used {} quantifiers to limit the numerator and denominator of the fractions to 4 decimal digits, which is what your [1-1000] notation hinted at. (As written, [1-1000] is a character class that matches only a single 0 or 1.) It could be approximated even better by 1?\d{1,3} (but this is not exactly the same either; it also allows, for example, 1999/1999)[1].
[1] P.S. As tripleee commented, the exact regular expression for decimal numbers ranging from 1 to 1000 is [1-9]([0-9][0-9]?)?|1000. It looks a bit complicated, but the construction becomes obvious if you set aside the only 4-digit number, 1000, and use a superfluous pair of parentheses on the 1- to 3-digit part: [1-9]([0-9]([0-9])?)?. Another option is to use the character class shortcut \d for [0-9], resulting in [1-9]\d{0,2}|1000.
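As a rough sketch of how the exact pattern could be plugged into the substitution: the names NUM and FRACTION_PART and the sample string are assumptions made for this example. Note that the alternation has to be wrapped in a group (here a non-capturing one) once it is embedded in a longer pattern, otherwise the | would split the whole expression.

```python
import re

# Exact pattern for the integers 1..1000, as given in the footnote.
# The non-capturing group keeps the "|" from splitting the larger pattern.
NUM = r'(?:[1-9]\d{0,2}|1000)'
FRACTION_PART = re.compile(r'\b(' + NUM + '/' + NUM + r' part)')

# Check the number pattern against every integer in a slightly wider range.
accepted = [n for n in range(0, 2000) if re.fullmatch(NUM, str(n))]
print(accepted == list(range(1, 1001)))  # True

# Made-up input: 1999/1999 and 0/5 fall outside 1..1000 and stay untouched.
s = "bla 1/6 part bla 1000/1000 part bla 1999/1999 part bla 0/5 part"
print(FRACTION_PART.sub(r'\1;', s))
# bla 1/6 part; bla 1000/1000 part; bla 1999/1999 part bla 0/5 part
```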
Upvotes: 4
Reputation:
You just have to use re.search and re.sub from the re module, along with the regex below:
import re
my_str = 'this is the first part of string 1/32 part this is the second part of string'
my_regex = r'(\d+/\d+\s+part)'
# re.search scans the whole string; re.match would only try the very start
if re.search(my_regex, my_str):
    # prints the whole string with a comma added after "1/32 part"
    print(re.sub(my_regex, r'\1,', my_str))
# ...
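For reference, the behaviour of re.match and re.search on this input can be checked with a small sketch. The variable names at_start and anywhere are only illustrative.

```python
import re

my_str = 'this is the first part of string 1/32 part this is the second part of string'
my_regex = r'(\d+/\d+\s+part)'

# re.match only tries to match at position 0 of the string.
at_start = re.match(my_regex, my_str)
# re.search scans forward and returns the first match anywhere.
anywhere = re.search(my_regex, my_str)

print(at_start)           # None
print(anywhere.group(1))  # 1/32 part
```

Since the string starts with "this", a pattern that begins with \d+ can only be found by re.search.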
Bear in mind that you may need to add extra flags to the regex if the input spans multiple lines; the re module documentation lists the available flags.
A quick replacement (there might be better ways) would be to also match the parts before and after the desired matching part and do something like:
import re
my_str = 'this is the first part of string 1/32 part this is the second part of string'
my_regex = r'(.*)(\s+\d+/\d+\s+part)(.*)'
condition = re.match(my_regex, my_str)
if condition:
    part = re.sub(my_regex, r'\2,', my_str)
    x = condition.group(1) + part + condition.group(3)
    print(x)
Which will output the modified string:
this is the first part of string 1/32 part, this is the second part of string
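One caveat with the (.*) groups: by default . does not match a newline, so on multi-line input this pattern fails unless re.DOTALL is passed. A minimal sketch, assuming a made-up three-line input:

```python
import re

my_regex = r'(.*)(\s+\d+/\d+\s+part)(.*)'
# Made-up multi-line input; the fraction sits on the second line.
multi_line = 'first line\nsecond 1/32 part rest\nthird line'

# Without flags, (.*) stops at the first newline and the match fails.
plain = re.match(my_regex, multi_line)
# With re.DOTALL, "." also matches newlines, so (.*) can span lines.
dotall = re.match(my_regex, multi_line, flags=re.DOTALL)

print(plain)  # None
if dotall:
    rebuilt = dotall.group(1) + dotall.group(2) + ',' + dotall.group(3)
    print(rebuilt)
```

The plain re.sub(r'(\d+/\d+\s+part)', ...) form does not have this problem, because it contains no . that needs to cross a line break.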
A simple one-line function with all of the above would be:
import re
def modify_string(my_str, my_regex):
    return re.sub(my_regex, r'\1,', my_str)

if __name__ == '__main__':
    print(modify_string('first part of string 1/32 part second part of string', r'(\d+/\d+\s+part)'))
But I'd recommend keeping the condition. Just in case.
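If the point of the condition is to know whether anything was replaced, one possible alternative (not from the answer above) is re.subn, which returns the new string together with the number of substitutions made. The function name modify_string_checked is hypothetical.

```python
import re

def modify_string_checked(my_str, my_regex):
    """Return (new_string, changed); hypothetical variant of modify_string."""
    # re.subn returns a tuple: (string after substitution, number of replacements)
    new_str, count = re.subn(my_regex, r'\1,', my_str)
    return new_str, count > 0

result, changed = modify_string_checked(
    'first part of string 1/32 part second part of string',
    r'(\d+/\d+\s+part)',
)
print(result)   # first part of string 1/32 part, second part of string
print(changed)  # True
```

This keeps the check and the replacement in a single pass over the string.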
Upvotes: 4