Using regex compile through loop in Python

Question

I have a text in which I want to match the words from a given set and tag each match. The code is this:

import re

mytext = "xxxxx repA1 yyyy REPA1 zzz."
geneset = {'leuB', 'repA1'}  # the actual set has ~1 million entries

result = mytext
for gene in geneset:
    regexp = re.compile(gene, flags=re.IGNORECASE)
    result = re.sub(regexp, r'\g<0>', mytext)

print(result)

The expected output is:

xxxxx repA1 yyyy REPA1 zzz.

Why does the code above fail to produce this result?

soloidx · Accepted Answer

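In your code, every iteration calls re.sub on the original text (mytext), which never changes inside the loop. Each pass therefore overwrites result with a fresh substitution of the untouched text, and only the last gene processed has any effect. If you pass the result variable instead, as in result = re.sub(regexp, r'\g<0>', result), the substitutions accumulate and the output will be correct.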
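
To make the fix concrete, here is a minimal sketch of the corrected loop. The function name tag_genes and the square-bracket markers are hypothetical stand-ins (the question does not show the actual tag text), and the call to re.escape is an added precaution, assuming gene names should be matched literally rather than as regex patterns.

```python
import re

def tag_genes(text, genes, open_tag="[", close_tag="]"):
    """Wrap each case-insensitive occurrence of every gene name in markers.

    open_tag and close_tag are hypothetical placeholders, not taken from
    the original question.
    """
    result = text
    for gene in genes:
        # Escape the name so characters like '.' or '+' are matched literally.
        pattern = re.compile(re.escape(gene), flags=re.IGNORECASE)
        # Substitute on the running result, not the original text,
        # so that earlier substitutions are kept.
        result = pattern.sub(
            lambda m: open_tag + m.group(0) + close_tag, result
        )
    return result

mytext = "xxxxx repA1 yyyy REPA1 zzz."
geneset = {"leuB", "repA1"}
tagged = tag_genes(mytext, geneset)
print(tagged)  # xxxxx [repA1] yyyy [REPA1] zzz.
```

Note that if one gene name is a substring of another, later passes may tag text inside an earlier tag, so a real data set of this size may need extra care such as word boundaries.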
