Rafael Martínez

Reputation: 335

Extracting text between two defined variables

I have the following code:

import re
p = 1
while p < 10 and p >= 1:
    p = p + 1
    primer = "Artículo %so" % (p - 1)
    ultimo = "Artículo %so" % p
    with open("LISR.txt") as ley:
        texto_original = ley.read()
        fragmento = str(re.findall(r'primer(.*?)ultimo', texto_original, re.DOTALL))

I have a problem with the last line of code. I would like to extract the text between the values of the two variables primer and ultimo. The problem is that the regex treats those words as literal strings, not as variables. So I tried the following:

fragmento = str(re.findall((r'%s(.*?)%s' % primer, ultimo), texto_original, re.DOTALL))

which raises the following error:

TypeError: not enough arguments for format string

How should I fix this?

Upvotes: 0

Views: 37

Answers (2)

CDspace

Reputation: 2689

From the docs (emphasis mine)

If format requires a single argument, values may be a single non-tuple object. Otherwise, values must be a tuple with exactly the number of items specified by the format string, or a single mapping object (for example, a dictionary).

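Without parentheses, % binds more tightly than the comma, so only primer reaches the format string. You simply need to wrap both variables in parentheses so that they form a tuple matching your two %s placeholders: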

(r'%s(.*?)%s' % ( primer, ultimo ) )
                ^                ^
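
To see the difference in isolation, here is a minimal sketch. The two sample values are assumptions chosen to mirror the question, and the names patron and mensaje are only illustrative:

```python
primer = "Artículo 1o"
ultimo = "Artículo 2o"

# Without parentheses, % binds tighter than the comma, so only `primer`
# is offered to a format string that expects two values.
try:
    patron = r'%s(.*?)%s' % primer, ultimo
except TypeError as error:
    mensaje = str(error)

# With an explicit tuple, both placeholders are filled.
patron = r'%s(.*?)%s' % (primer, ultimo)
```

The first attempt fails before re.findall is ever called, which is why the error mentions the format string rather than the regex.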

Upvotes: 1

Scott Hunter

Reputation: 49803

The interpreter cannot tell which arguments belong to findall and which are meant for formatting the string. Group the formatting arguments in a tuple:

fragmento = str(re.findall((r'%s(.*?)%s' % (primer, ultimo)), texto_original, re.DOTALL))
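
For a fuller picture, the following is a rough sketch of the whole extraction, not a drop-in replacement. It makes several assumptions: the sample text stands in for the contents of LISR.txt, the helper name extraer_fragmento is hypothetical, re.search is swapped in for re.findall so that a plain string comes back instead of a list, and re.escape is added so that any regex metacharacters in the two markers are matched literally.

```python
import re

# Hypothetical stand-in for the contents of LISR.txt.
texto_original = (
    "Artículo 1o. Primera disposición.\n"
    "Segunda línea.\n"
    "Artículo 2o. Otra disposición.\n"
    "Artículo 3o. Final.\n"
)


def extraer_fragmento(texto, primer, ultimo):
    """Return the text between the first `primer` and the next `ultimo`, or None."""
    # Escape both markers, then fill the two %s placeholders with a tuple.
    patron = r'%s(.*?)%s' % (re.escape(primer), re.escape(ultimo))
    coincidencia = re.search(patron, texto, re.DOTALL)
    return coincidencia.group(1) if coincidencia else None


fragmentos = {}
for p in range(1, 3):
    primer = "Artículo %so" % p
    ultimo = "Artículo %so" % (p + 1)
    fragmentos[p] = extraer_fragmento(texto_original, primer, ultimo)
```

Reading the file once, outside the loop, would also avoid reopening LISR.txt on every iteration.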

Upvotes: 0
