user1838663

Reputation: 35

Python re.sub appends string instead of replacing it

I am new to Python and am stuck on a regular expression replacement. I am parsing a settings file which has statements such as:

set fmri(convolve7) 3

The settings file is used as a template. The script parses through the template and writes a new settings file with updated settings.

The main structure of my program is

for line in infile:
    if condition:
        for each in listofelements:
            if re.search(each, line):
                print(re.sub(r"\d+", "0", line), file=outfile, end='')  # double output
    if re.search(somethingelse, line):
        print(re.sub(templatesubid, i, line), file=outfile, end='')  # normal substitution

etc.

The substitution inside the for loop produces double output, whereas the one outside the loop doesn't. The for loop seems to insert an extra line containing the correct substitution, i.e.

set fmri(convolve7) 0
set fmri(convolve7) 3

The other substitutions work as expected, even though they use the same code. Could the for loop be causing this double output?

Upvotes: 0

Views: 554

Answers (1)

Blckknght

Reputation: 104802

It looks like the relevant code is at the bottom:

    for line in infile:
        if len(subparamlist) > 0:
            for j in subparamlist:
                query = j.replace(")", r"\)").replace("(", r"\(")
                if re.search(query, line):
                    print(re.sub(r"\d+", "0", line), file=outfile, end='') #trouble!
        if re.search(templatesubid, line) and re.search(r'feat_files\(', line) or re.search(r'\(custom', line): # correct the subjectID
            print(re.sub(templatesubid, i, line), file=outfile, end='')
        elif re.search(str(nptslist[2]), line): # correct the number of timepoints
            print(re.sub(str(nptslist[2]), str(nvols[0]), line), file = outfile, end='')
        else: 
            print(line, file=outfile, end='') # if nothing to do, just copy the line to the new text file.

I think the problem is that you're printing in the top if statement (substituting 0 into the line) and then printing again in one of the branches of the if/elif/else block below it. The result is that some (or all) lines are doubled.
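To make the doubling concrete, here is a minimal reproduction. It is only a sketch: the line comes from the question, while the "sub01"/"sub02" values and the in-memory output buffer are stand-ins I made up for illustration. The first if writes the substituted line, then the separate if/else below it falls through to else and writes the original line as well.

```python
import io
import re

line = "set fmri(convolve7) 3\n"
out = io.StringIO()  # stands in for outfile

# Independent `if`: writes the line with the substitution applied.
# (A bare r"\d+" replaces every run of digits, including the 7 in convolve7.)
if re.search(r"fmri\(convolve7\)", line):
    print(re.sub(r"\d+", "0", line), file=out, end="")

# Separate if/else chain: "sub01" is a hypothetical subject ID that does not
# occur in this line, so the else branch copies the untouched line.
if re.search("sub01", line):
    print(re.sub("sub01", "sub02", line), file=out, end="")
else:
    print(line, file=out, end="")

# One input line has produced two output lines.
print(out.getvalue(), end="")
```

Because the two statements are independent, every line that matches the first condition is written twice: once substituted and once by whichever branch of the second chain runs.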

I didn't actually understand the code well enough to work out an appropriate fix, but a possible start might be to change the if you've commented with "correct the subjectID" to an elif.
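One possible shape for such a fix is sketched below, under a few assumptions: the function name, its parameters, and the sample lines (including the /data/sub01/run1 path) are hypothetical, the timepoint branch is left out for brevity, and I assume the first matching rule should win. The idea is to decide on the new text in a single if/elif chain and write each line exactly once at the end.

```python
import io
import re


def rewrite_settings(infile, outfile, subparamlist, templatesubid, new_subid):
    """Copy infile to outfile, writing every line exactly once."""
    for line in infile:
        # re.escape handles "(" and ")" so each name is matched literally.
        if any(re.search(re.escape(name), line) for name in subparamlist):
            # Assumption: only the trailing value should become 0. A bare
            # r"\d+" would also rewrite the 7 inside "convolve7".
            line = re.sub(r"\d+(?=\s*$)", "0", line)
        elif re.search(templatesubid, line):
            line = re.sub(templatesubid, new_subid, line)
        # Single write per line, whichever branch ran (or none).
        outfile.write(line)


# Hypothetical template contents, for illustration only.
template = io.StringIO(
    "set fmri(convolve7) 3\n"
    "set feat_files(1) /data/sub01/run1\n"
    "set fmri(tr) 2\n"
)
result = io.StringIO()
rewrite_settings(template, result, ["fmri(convolve7)"], "sub01", "sub02")
print(result.getvalue(), end="")
```

Since there is only one write per iteration, a line cannot be doubled no matter how many conditions it matches. If a line is supposed to receive several substitutions, the elif could be turned back into a plain if while still keeping the single write at the end.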

Upvotes: 1
