Reputation: 2182
I am looking for a specific regex (or another solution with comparable performance) in Python to substitute patterns like those in the following examples:
...-1AG.,., should be transformed as ...G.,.,
..,-1A,.,., should be transformed as ..,,.,.,
...-2GTC,., should be transformed as ...C,.,
..,-2GT.,., should be transformed as ..,.,.,
...+3TAGT,, should be transformed as ...T,,
..,+3TAG.,. should be transformed as ..,.,.
Basically:
AnySymbol (not only dots and commas), followed by one +/- sign, followed by one digit (1..9), followed by several letters, the number of which depends on the preceding digit, and finally AnySymbol (not only dots and commas),
should be transformed to:
AnySymbol (not only dots and commas) and AnySymbol (not only dots and commas).
Obviously the solution String = re.sub(r'[\-\+]\d\w+', "", String) is not right for a case like ...-1AG.,., (which should be transformed as ...G.,.,), because \w+ also removes the letters that should stay.
So far I am looping over r'[\-\+]1\w', r'[\-\+]2\w\w', r'[\-\+]3\w\w\w' ... r'[\-\+]9\w\w\w\w\w\w\w\w\w', but I am hoping for a more elegant solution. Any ideas?
Upvotes: 2
Views: 70
Reputation: 67968
Have a look at this working demo.
x="""...-1AG.,., should be transformed as ...G.,.,
..,-1A,.,., should be transformed as ..,,.,.,
...-2GTC,., should be transformed as ...C,.,
..,-2GT.,., should be transformed as ..,.,.,
...+3TAGT,, should be transformed as ...T,,
..,+3TAG.,. should be transformed as ..,.,."""
import re

def repl(matchobj):
    # group(1) is the count, group(2) the run of letters: drop the first `count` letters
    return matchobj.group(2)[int(matchobj.group(1)):]

print(re.sub(r"[+-](\d+)([a-zA-Z]+)", repl, x))
You can pass your own function to re.sub to make customized replacements.
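If you need this in more than one place, the same callback idea can be wrapped in a small helper. This is only a sketch: the names INDEL and strip_indels are made up for illustration, and it assumes the characters to drop are ASCII letters, as in the question's examples. The pattern is precompiled, which may help a little if you call it on many strings.

```python
import re

# Hypothetical names; the pattern is the same as in the answer above.
# group(1) = the count after the sign, group(2) = the run of letters that follows.
INDEL = re.compile(r"[+-](\d+)([A-Za-z]+)")

def strip_indels(text):
    """Remove each +N/-N marker together with the N letters that follow it."""
    def repl(match):
        count = int(match.group(1))
        letters = match.group(2)
        # Slice off the first `count` letters; anything after them is kept.
        return letters[count:]
    return INDEL.sub(repl, text)

# The six examples from the question, as (input, expected output) pairs.
examples = [
    ("...-1AG.,.,", "...G.,.,"),
    ("..,-1A,.,.,", "..,,.,.,"),
    ("...-2GTC,.,", "...C,.,"),
    ("..,-2GT.,.,", "..,.,.,"),
    ("...+3TAGT,,", "...T,,"),
    ("..,+3TAG.,.", "..,.,."),
]
for source, expected in examples:
    assert strip_indels(source) == expected
```

One thing to be aware of: if fewer than `count` letters follow the number, the slice simply returns an empty string, so the marker and all of those letters are removed without any error being raised.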
Upvotes: 3