Ivaylo

Reputation: 2182

Substitution of a variable pattern with RegEx in Python

I am looking for a RegEx (or another solution with similar performance) in Python to substitute patterns like the ones in the following examples:

...-1AG.,., should be transformed as ...G.,.,
..,-1A,.,., should be transformed as ..,,.,.,
...-2GTC,., should be transformed as ...C,.,
..,-2GT.,., should be transformed as ..,.,.,
...+3TAGT,, should be transformed as ...T,,
..,+3TAG.,. should be transformed as ..,.,.

Basically:

AnySymbol (not only dots and commas), followed by one +/- sign, followed by one digit (1..9), followed by several letters, the number of which is given by that digit, and finally AnySymbol (not only dots and commas),

should be transformed to:

AnySymbol (not only dots and commas) and AnySymbol (not only dots and commas).

Obviously, String = re.sub(r'[\-\+]\d\w+', "", String) is not right, because it removes every letter after the digit and so fails on cases such as ...-1AG.,., which should become ...G.,.,. So far I am looping over r'[\-\+]1\w', r'[\-\+]2\w\w', r'[\-\+]3\w\w\w' ... r'[\-\+]9\w\w\w\w\w\w\w\w\w', but I am hoping for a more elegant solution. Any ideas?

Upvotes: 2

Views: 70

Answers (1)

vks

Reputation: 67968

Have a look at this working demo.

import re

x="""...-1AG.,., should be transformed as ...G.,.,
..,-1A,.,., should be transformed as ..,,.,.,
...-2GTC,., should be transformed as ...C,.,
..,-2GT.,., should be transformed as ..,.,.,
...+3TAGT,, should be transformed as ...T,,
..,+3TAG.,. should be transformed as ..,.,."""

def repl(matchobj):
    # group(1) is the count, group(2) is the run of letters after it.
    # Drop the first `count` letters and keep whatever follows.
    return matchobj.group(2)[int(matchobj.group(1)):]

print(re.sub(r"[+-](\d+)([a-zA-Z]+)", repl, x))

You can use your own function in re.sub to make customized replacements.
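As a rough sketch of how that callback could be wrapped for reuse, the version below precompiles the pattern and checks it against the six examples from the question. The names INDEL, _keep_tail and strip_counted_runs are made up for illustration, and restricting the count to a single digit 1..9 is an assumption taken from the question's description rather than from the demo above.

```python
import re

# Hypothetical names. The single digit [1-9] follows the question's wording;
# use (\d+) instead if multi-digit counts can occur.
INDEL = re.compile(r"[+-]([1-9])([A-Za-z]+)")

def _keep_tail(match):
    count = int(match.group(1))
    letters = match.group(2)
    # Drop the first `count` letters; any letters beyond that are kept.
    return letters[count:]

def strip_counted_runs(text):
    # re.sub calls _keep_tail once per match and inserts its return value.
    return INDEL.sub(_keep_tail, text)

examples = {
    "...-1AG.,.,": "...G.,.,",
    "..,-1A,.,.,": "..,,.,.,",
    "...-2GTC,.,": "...C,.,",
    "..,-2GT.,.,": "..,.,.,",
    "...+3TAGT,,": "...T,,",
    "..,+3TAG.,.": "..,.,.",
}
for source, expected in examples.items():
    assert strip_counted_runs(source) == expected
```

Compiling the pattern once avoids looking it up again on every call, which may matter if the function runs over many lines.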

Upvotes: 3
